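InternLM Open-Source LLM Training Camp, Lecture 4: Notes
1. Fine-Tuning Overview
1.1 Why fine-tune? A large language model has general knowledge across many industries, but when you go deep into a specific domain its performance is often unsatisfactory, so it needs fine-tuning.
1.2 Two kinds of fine-tuning: incremental pretraining and instruction fine-tuning.
1.4 Incremental pretraining: feed the model additional domain-specific text, and let it continue training on the new corpus.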
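1.5 Instruction fine-tuning: a base model learns only the language distribution of its pretraining dataset, so by itself it cannot understand the intent behind a question. Some method is needed to make the base model understand human intent (instructions); that method is called instruction fine-tuning.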
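1.6 How to do instruction fine-tuning: use an instruction template with three roles: System, User, and Assistant. System holds the domain background and intent, User holds the question to be answered, and Assistant holds the expected answer. Many instruction fine-tuning frameworks exist to simplify this work; XTuner is one of them.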
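1.7 Different open-source models use different fine-tuning templates; the Llama and InternLM formats, for example, differ somewhat. In every case, though, the template is assembled automatically by the fine-tuning framework, including at inference time.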
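To make the idea of template assembly concrete, here is a minimal sketch. The two styles and their marker strings (`[SYS]`, `<USER>`, and so on) are made up for illustration and are not the real Llama or InternLM formats; `build_prompt` is a hypothetical helper, not an XTuner function.

```python
# Hypothetical templates, for illustration only. Real models define their own markers.
TEMPLATES = {
    "style_a": {
        "system": "[SYS]{system}[/SYS]\n",
        "user": "[USER]{user}[/USER]\n",
        "assistant": "[BOT]{assistant}",
    },
    "style_b": {
        "system": "<SYSTEM>{system}\n",
        "user": "<USER>{user}\n",
        "assistant": "<BOT>{assistant}",
    },
}


def build_prompt(style, system, user, assistant=""):
    """Assemble one System/User/Assistant turn using the chosen template style."""
    t = TEMPLATES[style]
    parts = []
    if system:  # skip the system segment when no background is given
        parts.append(t["system"].format(system=system))
    parts.append(t["user"].format(user=user))
    # During training the assistant slot holds the expected answer;
    # at inference it is left empty so the model continues from here.
    parts.append(t["assistant"].format(assistant=assistant))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("style_a", "You are a medical assistant.", "What is a fever?"))
    print(build_prompt("style_b", "", "hi", "hello"))
```

The same three pieces of content produce different strings under each style, which is why the framework, rather than the user, should own the assembly step.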
1.8 For instruction fine-tuning you prepare both input and output data, but the loss is computed only on the output (the label) part.
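A small sketch of what "loss only on the output" can look like. It assumes the common convention of marking ignored positions with the label -100, and it simplifies in two ways: per-token probabilities are passed in directly instead of coming from a model, and the usual one-token shift between inputs and labels is left out. The function names are hypothetical.

```python
import math

IGNORE_INDEX = -100  # assumed "skip this position" label, a common convention


def build_labels(prompt_ids, answer_ids):
    """Concatenate prompt and answer tokens; mask the prompt part in the labels."""
    input_ids = list(prompt_ids) + list(answer_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels


def masked_nll(token_probs, labels):
    """Average negative log-likelihood over positions whose label is not masked.

    token_probs[i] is the probability the model gave to the correct token
    at position i (supplied by hand here, for illustration).
    """
    kept = [-math.log(p) for p, y in zip(token_probs, labels) if y != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    input_ids, labels = build_labels([1, 2, 3], [4, 5])
    print(labels)
    # The three prompt positions are ignored, however poorly they are predicted.
    print(masked_nll([0.1, 0.1, 0.1, 0.5, 0.5], labels))
```

Only the two answer positions contribute, so the poor predictions on the prompt tokens do not affect the result.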
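1.9 Incremental pretraining: unlike the question-and-answer corpus used for instruction fine-tuning, the incremental pretraining corpus has only answers, or in other words only declarative text. So when writing the corpus, the System and User parts are left empty. The loss is computed the same way as in instruction fine-tuning.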
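The difference between the two kinds of training records can be sketched as follows. The field names `conversation`, `system`, `input`, and `output` are an assumption about the record layout and should be checked against the XTuner documentation for the version in use; both helper functions are hypothetical.

```python
import json


def instruction_record(system, user, answer):
    """One single-turn instruction sample (field names are assumed, see lead-in)."""
    return {"conversation": [{"system": system, "input": user, "output": answer}]}


def pretrain_record(text):
    """An incremental-pretraining sample: System and User are empty, only text remains."""
    return instruction_record("", "", text)


if __name__ == "__main__":
    records = [
        instruction_record("You are a medical assistant.", "What is a fever?",
                           "A fever is a temporary rise in body temperature."),
        pretrain_record("A fever is a temporary rise in body temperature."),
    ]
    # One JSON object per line, i.e. JSONL.
    for r in records:
        print(json.dumps(r, ensure_ascii=False))
```

Because both record types share one shape, the same loading and loss code can serve both training modes.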
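1.10 XTuner uses LoRA and QLoRA.
The linear (fully connected) layers of an LLM hold a huge number of parameters. Fine-tuning all of them takes a lot of GPU memory and compute. To save both, a bypass branch is added next to the original linear layer, approximating the effect of full fine-tuning with far fewer parameters. The bypass consists of two small (low-rank) transformation matrices.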
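The bypass idea can be written as y = W x + scale * B (A x), where W is the frozen original weight and A, B are the two small trainable matrices. Below is a minimal pure-Python sketch of that formula with a parameter count; the function names and the `scale` argument are illustrative and not taken from any library.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, scale=1.0):
    """y = W x + scale * B (A x).

    W: d_out x d_in (frozen), A: r x d_in, B: d_out x r (both trainable).
    """
    base = matvec(W, x)
    bypass = matvec(B, matvec(A, x))
    return [b + scale * p for b, p in zip(base, bypass)]


def param_counts(d_out, d_in, r):
    """Return (full fine-tuning parameters, bypass parameters) for one linear layer."""
    return d_out * d_in, r * (d_in + d_out)


if __name__ == "__main__":
    W = [[1, 2, 3], [4, 5, 6]]
    A = [[1, 0, 1]]      # rank r = 1
    B = [[1], [2]]
    print(lora_forward(W, A, B, [1, 1, 1]))
    full, bypass = param_counts(4096, 4096, 8)
    print(full, bypass, bypass / full)
```

For a 4096 x 4096 layer with rank 8, the bypass trains 65,536 parameters instead of 16,777,216, which is where the memory saving comes from.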
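1.11 Comparison of full-parameter fine-tuning, LoRA, and QLoRA
2. Introduction to XTuner
2.1 XTuner: an open-source fine-tuning framework. It supports the HuggingFace and ModelScope ecosystems and several open-source model families, including Llama, Qwen (Tongyi Qianwen), ChatGLM, InternLM, and the latest MoE models. It supports a range of GPUs, from consumer-grade cards to data-center cards.
2.2 Quick start: (a) install xtuner with pip, taking care to pin the version; (b) choose a config template; (c) launch training with one command. In practice: copy the config template, edit its parameters, and start training.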
Training produces an adapter file. At inference time, this adapter file must be loaded in addition to the base model.
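Why the adapter is useless without the base model can be seen from the arithmetic: the adapter stores only the two small matrices, and the effective weight is W + scale * B A. The sketch below shows that merging the adapter into the base weight gives the same output as running base and bypass side by side. It is plain Python for illustration and does not reflect how any particular library loads adapters.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def merge_adapter(W, A, B, scale=1.0):
    """Fold the adapter into the base weight: W' = W + scale * (B A)."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


if __name__ == "__main__":
    W = [[1, 2, 3], [4, 5, 6]]   # base model weight
    A = [[1, 0, 1]]              # adapter, part 1
    B = [[1], [2]]               # adapter, part 2
    merged = merge_adapter(W, A, B)
    print(merged)
    print(matvec(merged, [1, 1, 1]))
```

Keeping the adapter separate keeps the saved file small; merging it trades that for a single set of weights at inference.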
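2.4 XTuner also supports chatting with tool-using (agent-style) models.
2.5 XTuner has a powerful data-processing engine. It can quickly map datasets in different formats and start training on them, and it supports packing multiple samples into a single sequence to speed up training. JSON or JSONL is the recommended format.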
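Packing several short samples into one sequence can be sketched with a simple greedy rule. This is an illustrative simplification with a hypothetical function name; XTuner's own packing may behave differently, for example in how it treats samples that cross a boundary.

```python
def pack_samples(samples, max_length):
    """Greedily concatenate tokenized samples into sequences of at most max_length tokens."""
    packed, current = [], []
    for ids in samples:
        ids = list(ids)[:max_length]  # truncate over-long samples (a simplification)
        # Start a new sequence when the next sample would not fit.
        if current and len(current) + len(ids) > max_length:
            packed.append(current)
            current = []
        current = current + ids
    if current:
        packed.append(current)
    return packed


if __name__ == "__main__":
    samples = [[1, 2], [3], [4, 5, 6], [7]]
    print(pack_samples(samples, max_length=4))
```

Four short samples become two full-length sequences, so fewer padded positions are processed per training step.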
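3. Working with LLMs on an 8 GB GPU
3.1 XTuner enables Flash Attention acceleration by default. DeepSpeed ZeRO is not enabled by default. These optimizations can greatly improve training efficiency; ZeRO, being off by default, has to be turned on explicitly when launching training.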
4. Hands-On
4.1 Log in to the dev machine already set up in Lecture 3:
4.2 Installation
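# If you are on another platform:
conda create --name xtuner0.1.9 python=3.10 -y
# Activate the environment
conda activate xtuner0.1.9
# Go to the home directory (~ means "the current user's home path")
cd ~
# Create a version folder and enter it, to follow along with this tutorial
mkdir xtuner019 && cd xtuner019
# Users who cannot access GitHub can clone from Gitee:
git clone -b v0.1.9 https://gitee.com/Internlm/xtuner
# Enter the source directory
cd xtuner
# Install XTuner from source
pip install -e '.[all]'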
Screen output:
(base) root@intern-studio-069640:~# conda activate xtuner0.1.9
(xtuner0.1.9) root@intern-studio-069640:~# # Go to the home directory (~ means "the current user's home path")
(xtuner0.1.9) root@intern-studio-069640:~# cd ~
(xtuner0.1.9) root@intern-studio-069640:~# # Create a version folder and enter it, to follow along with this tutorial
(xtuner0.1.9) root@intern-studio-069640:~# mkdir xtuner019 && cd xtuner019
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019#
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# git clone -b v0.1.9 https://gitee.com/Internlm/xtuner
Cloning into 'xtuner'...
remote: Enumerating objects: 6342, done.
remote: Counting objects: 100% (3757/3757), done.
remote: Compressing objects: 100% (747/747), done.
remote: Total 6342 (delta 3080), reused 3614 (delta 2964), pack-reused 2585
Receiving objects: 100% (6342/6342), 1.14 MiB | 692.00 KiB/s, done.
Resolving deltas: 100% (4901/4901), done.
Note: switching to '9f686f08c8e60e568e811aaad8daf9c08462d42d'.
You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example:
  git switch -c <new-branch-name>
Or undo this operation with:
  git switch -
Turn off this advice by setting config variable advice.detachedHead to false
Updating files: 100% (430/430), done.
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# # Enter the source directory
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# cd xtuner
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner#
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner# # Install XTuner from source
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner# pip install -e '.[all]'
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///root/xtuner019/xtuner
Preparing metadata (setup.py) ...
done Collecting bitsandbytes>=0.40.0 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9b/63/489ef9cd7a33c1f08f1b2be51d1b511883c5e34591aaa9873b30021cd679/bitsandbytes-0.42.0-py3-none-any.whl (105.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 105.0/105.0 MB 43.2 MB/s eta 0:00:00 Collecting datasets Downloading https://pypi.tuna.tsinghua.edu.cn/packages/74/4d/63b033169534f0742b7fe13957118cae08c83b04bfde46511f397872e2e7/datasets-2.17.0-py3-none-any.whl (536 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.6/536.6 kB 12.4 MB/s eta 0:00:00 Collecting einops Using cached https://pypi.tuna.tsinghua.edu.cn/packages/29/0b/2d1c0ebfd092e25935b86509a9a817159212d82aa43d7fb07eca4eeff2c2/einops-0.7.0-py3-none-any.whl (44 kB) Collecting fsspec<=2023.6.0 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e3/bd/4c0a4619494188a9db5d77e2100ab7d544a42e76b2447869d8e124e981d8/fsspec-2023.6.0-py3-none-any.whl (163 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 5.9 MB/s eta 0:00:00 Collecting lagent>=0.1.2 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/54/51/0cd9df1ec309b9d73e2a009bf61a8d8c84c34b27480994fe83a7fa8f24d3/lagent-0.2.1-py3-none-any.whl (69 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.4/69.4 kB 3.2 MB/s eta 0:00:00 Collecting mmengine>=0.9.1 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/92/f8/0ec23b2d7fd2d3aebe05a70b8b4ff314c0cb552a614b1656ca1cb2a11633/mmengine-0.10.3-py3-none-any.whl (451 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 451.7/451.7 kB 23.9 MB/s eta 0:00:00 Collecting modelscope Downloading https://pypi.tuna.tsinghua.edu.cn/packages/32/7f/5e49028db40c58a0ecea4f5a6ead189294353b793bb403d233b00cb35ac7/modelscope-1.12.0-py3-none-any.whl (5.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.6/5.6 MB 67.3 MB/s eta 0:00:00 Collecting peft>=0.4.0 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/07/63/168af5aa8dbda9c23ad774a4c1d311cfe220c634e0d05a3a82a7cae01bd8/peft-0.8.2-py3-none-any.whl (183 kB) 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 183.4/183.4 kB 8.9 MB/s eta 0:00:00 Collecting scipy Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f5/aa/8e6071a5e4dca4ec68b5b22e4991ee74c59c5d372112b9c236ec1faff57d/scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.4 MB) Collecting SentencePiece Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7f/e5/323dc813b3e1339305f888d035e2f3725084fc4dcf051995b366dd26cc90/sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB) Collecting tiktoken Using cached https://pypi.tuna.tsinghua.edu.cn/packages/16/05/5efbd91252ffb1301ea393d88ef736b33d41e75d4bcf0bd31d660050e400/tiktoken-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB) Collecting torch Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8c/67/fcc9b9e2369a9bae4da492aedc0c2dfa95d563ef0eaa9228b70c98395ec2/torch-2.2.0-cp310-cp310-manylinux1_x86_64.whl (755.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 755.5/755.5 MB 12.6 MB/s eta 0:00:00 Collecting transformers<=4.34.0,>=4.32.1 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1a/d1/3bba59606141ae808017f6fde91453882f931957f125009417b87a281067/transformers-4.34.0-py3-none-any.whl (7.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 38.3 MB/s eta 0:00:00 Collecting transformers_stream_generator Downloading https://pypi.tuna.tsinghua.edu.cn/packages/36/26/3492ab0e45d814533b34ca605f8a20fdc032736f937679c6f212d81a76a5/transformers-stream-generator-0.0.4.tar.gz (12 kB) Preparing metadata (setup.py) ... done Collecting deepspeed>=0.12.3 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5d/4b/382b6c7f22a9f51875e5a159a2a8e94c2b3b01b0c86f7bed2ea7cf919549/deepspeed-0.13.2.tar.gz (1.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 28.1 MB/s eta 0:00:00 Preparing metadata (setup.py) ... 
done Collecting mpi4py-mpich Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1a/e3/942a8e3322e3f1a265409d4028843c2770864f9ee699ba692296aa743232/mpi4py_mpich-3.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0/6.0 MB 40.8 MB/s eta 0:00:00 Collecting hjson (from deepspeed>=0.12.3) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1f/7f/13cd798d180af4bf4c0ceddeefba2b864a63c71645abc0308b768d67bb81/hjson-3.1.0-py3-none-any.whl (54 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.0/54.0 kB 2.9 MB/s eta 0:00:00 Collecting ninja (from deepspeed>=0.12.3) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6d/92/8d7aebd4430ab5ff65df2bfee6d5745f95c004284db2d8ca76dcbfd9de47/ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 kB 9.0 MB/s eta 0:00:00 Collecting numpy (from deepspeed>=0.12.3) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4b/d7/ecf66c1cd12dc28b4040b15ab4d17b773b87fa9d29ca16125de01adb36cd/numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 62.6 MB/s eta 0:00:00 Collecting packaging>=20.0 (from deepspeed>=0.12.3) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl (53 kB) Collecting psutil (from deepspeed>=0.12.3) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c5/4f/0e22aaa246f96d6ac87fe5ebb9c5a693fbe8877f537a1022527c47ca43c5/psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) Collecting py-cpuinfo (from deepspeed>=0.12.3) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e0/a9/023730ba63db1e494a271cb018dcd361bd2c917ba7004c3e49d5daf795a2/py_cpuinfo-9.0.0-py3-none-any.whl (22 kB) Collecting pydantic (from deepspeed>=0.12.3) Using 
cached https://pypi.tuna.tsinghua.edu.cn/packages/db/dc/afecbd9650f486889181c6d1a0d675b580c06253ea7e304588e4c7485bdb/pydantic-2.6.1-py3-none-any.whl (394 kB) Collecting pynvml (from deepspeed>=0.12.3) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5b/9c/adb8070059caaa15d5a572b66bccd95900d8c1b9fa54d6ecea6ae97448d1/pynvml-11.5.0-py3-none-any.whl (53 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 1.0 MB/s eta 0:00:00 Collecting tqdm (from deepspeed>=0.12.3) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2a/14/e75e52d521442e2fcc9f1df3c5e456aead034203d4797867980de558ab34/tqdm-4.66.2-py3-none-any.whl (78 kB) Collecting arxiv (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/99/16/532c2aa4bc83b2356820efd4d1f619e45178dc3a0dc0cde16fbccdc43fc1/arxiv-2.1.0-py3-none-any.whl (11 kB) Collecting distro (from lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl (20 kB) Collecting func-timeout (from lagent>=0.1.2) Using cached func_timeout-4.3.5-py3-none-any.whl Collecting google-search-results (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/77/30/b3a6f6a2e00f8153549c2fa345c58ae1ce8e5f3153c2fe0484d444c3abcb/google_search_results-2.4.2.tar.gz (18 kB) Preparing metadata (setup.py) ... 
done Collecting griffe (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/aa/4c/7268d218ee38cb0e07d63fc3fe60fe19dc353f757db3d365f0b5ffba85be/griffe-0.40.1-py3-none-any.whl (116 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.9/116.9 kB 2.9 MB/s eta 0:00:00 Collecting json5 (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/ba/fa37123a86ae8287d6678535a944f9c3377d8165e536310ed6f6cb0f0c0e/json5-0.9.14-py2.py3-none-any.whl (19 kB) Collecting jsonschema (from lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/9d/b035d024c62c85f2e2d4806a59ca7b8520307f34e0932fbc8cc75fe7b2d9/jsonschema-4.21.1-py3-none-any.whl (85 kB) Collecting jupyter (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB) Collecting jupyter-client (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/43/ae/5f4f72980765e2e5e02b260f9c53bcc706cefa7ac9c8d7240225c55788d4/jupyter_client-8.6.0-py3-none-any.whl (105 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 105.9/105.9 kB 2.6 MB/s eta 0:00:00 Collecting phx-class-registry (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9b/46/02f4f5fb40f5ccbb3fc23a328fb3314843375d050a3b40ec21a8c18b5762/phx_class_registry-4.1.0-py3-none-any.whl (13 kB) Collecting pillow (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/cb/c3/98faa3e92cf866b9446c4842f1fe847e672b2f54e000cb984157b8095797/pillow-10.2.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 44.6 MB/s eta 0:00:00 Collecting python-pptx (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/72/49/6eee83072983473e9905ffddd5c2032b9a0ca4616425560d6d582287b467/python_pptx-0.6.23-py3-none-any.whl (471 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 471.6/471.6 kB 21.5 MB/s eta 
0:00:00 Collecting requests (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl (62 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 3.4 MB/s eta 0:00:00 Collecting timeout-decorator (from lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/f8/0802dd14c58b5d3d72bb9caa4315535f58787a1dc50b81bbbcaaa15451be/timeout-decorator-0.5.0.tar.gz (4.8 kB) Preparing metadata (setup.py) ... done Collecting typing-extensions (from lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b7/f4/6a90020cd2d93349b442bfcb657d0dc91eee65491600b2cb1d388bc98e6b/typing_extensions-4.9.0-py3-none-any.whl (32 kB) Collecting addict (from mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl (3.8 kB) Collecting matplotlib (from mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c1/f2/325897d6c498278b0f8b460d44b516f5db865ddb4ba9018e9fe58a3e4633/matplotlib-3.8.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB) Collecting pyyaml (from mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/29/61/bf33c6c85c55bc45a29eee3195848ff2d518d84735eb0e2d8cb42e0d285e/PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB) Collecting rich (from mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/be/be/1520178fa01eabe014b16e72a952b9f900631142ccd03dc36cf93e30c1ce/rich-13.7.0-py3-none-any.whl (240 kB) Collecting termcolor (from mmengine>=0.9.1) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/5f/8c716e47b3a50cbd7c146f45881e11d9414def768b7cd9c5e6650ec2a80a/termcolor-2.4.0-py3-none-any.whl (7.7 kB) Collecting yapf (from mmengine>=0.9.1) Using cached 
https://pypi.tuna.tsinghua.edu.cn/packages/66/c9/d4b03b2490107f13ebd68fe9496d41ae41a7de6275ead56d0d4621b11ffd/yapf-0.40.2-py3-none-any.whl (254 kB) Collecting opencv-python>=3 (from mmengine>=0.9.1) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/64/7fdfb9386511cd6805451e012c537073a79a958a58795c4e602e538c388c/opencv_python-4.9.0.80-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (62.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.2/62.2 MB 52.6 MB/s eta 0:00:00 Collecting accelerate>=0.21.0 (from peft>=0.4.0) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1b/da/24a54b9205fce3bdbaad521c35944d0b0a2d292ac5ae921e484b76312b43/accelerate-0.27.2-py3-none-any.whl (279 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 280.0/280.0 kB 24.4 MB/s eta 0:00:00 Collecting safetensors (from peft>=0.4.0) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d0/ba/b2254fafc7f5fdc98a2fa4d5a5eeb029fbf9589ec87f2c230c3ac0a1dd53/safetensors-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB) Collecting huggingface-hub>=0.17.0 (from peft>=0.4.0) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/28/03/7d3c7153113ec59cfb31e3b8ee773f5f420a0dd7d26d40442542b96675c3/huggingface_hub-0.20.3-py3-none-any.whl (330 kB) Collecting filelock (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl (11 kB) Collecting sympy (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl (5.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 60.9 MB/s eta 0:00:00 Collecting networkx (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl (1.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 24.8 MB/s eta 0:00:00 Collecting 
jinja2 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/30/6d/6de6be2d02603ab56e72997708809e8a5b0fbfee080735109b40a3564843/Jinja2-3.1.3-py3-none-any.whl (133 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.2/133.2 kB 7.4 MB/s eta 0:00:00 Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 55.7 MB/s eta 0:00:00 Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 22.7 MB/s eta 0:00:00 Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 59.2 MB/s eta 0:00:00 Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 11.6 MB/s eta 0:00:00 Collecting nvidia-cublas-cu12==12.1.3.1 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 14.8 MB/s eta 0:00:00 Collecting nvidia-cufft-cu12==11.0.2.54 (from torch) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 19.8 MB/s eta 0:00:00 Collecting nvidia-curand-cu12==10.3.2.106 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 20.3 MB/s eta 0:00:00 Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 37.6 MB/s eta 0:00:00 Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 17.7 MB/s eta 0:00:00 Collecting nvidia-nccl-cu12==2.19.3 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/38/00/d0d4e48aef772ad5aebcf70b73028f88db6e5640b36c38e90445b7a57c45/nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 18.6 MB/s eta 0:00:00 Collecting nvidia-nvtx-cu12==12.1.105 (from torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 6.1 MB/s eta 0:00:00 Collecting triton==2.2.0 (from torch) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/95/05/ed974ce87fe8c8843855daa2136b3409ee1c126707ab54a8b72815c08b49/triton-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (167.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.9/167.9 MB 22.2 MB/s eta 0:00:00 Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1e/07/bf730d44c2fe1b676ad9cc2be5f5f861eb5d153fb6951987a2d6a96379a9/nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl (20.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.5/20.5 MB 32.4 MB/s eta 0:00:00 Collecting regex!=2019.12.17 (from transformers<=4.34.0,>=4.32.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/81/8a/96a62ce98e8ff1b16db56fde3debc8a571f6b7ea42ee137eb0d995cdfa26/regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773 kB) Collecting tokenizers<0.15,>=0.14 (from transformers<=4.34.0,>=4.32.1) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a7/7b/c1f643eb086b6c5c33eef0c3752e37624bd23e4cbc9f1332748f1c6252d1/tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 37.8 MB/s eta 0:00:00 Collecting pyarrow>=12.0.0 (from datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d4/ca/ef67abb77f9dd51a0d3ff7fcebff58296068a046d7da352b9548070005ed/pyarrow-15.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (38.3 MB) Collecting pyarrow-hotfix (from datasets) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e4/f4/9ec2222f5f5f8ea04f66f184caafd991a39c8782e31f5b0266f101cb68ca/pyarrow_hotfix-0.6-py3-none-any.whl (7.9 kB) Collecting dill<0.3.9,>=0.3.0 (from datasets) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/7a/cef76fd8438a42f96db64ddaa85280485a9c395e7df3db8158cfec1eee34/dill-0.3.8-py3-none-any.whl (116 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 4.4 MB/s eta 0:00:00 Collecting pandas (from 
datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b3/b3/3102c3a4abca1093e50cfec2213102a1c65c0b318a4431395d0121e6e690/pandas-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB) Collecting xxhash (from datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/80/8a/1dd41557883b6196f8f092011a5c1f72d4d44cf36d7b67d4a5efe3127949/xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB) Collecting multiprocess (from datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/bc/f7/7ec7fddc92e50714ea3745631f79bd9c96424cb2702632521028e57d3a36/multiprocess-0.70.16-py310-none-any.whl (134 kB) Collecting aiohttp (from datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/93/40/d3decda219ebd5410eba627601d537ec3782efbcadba308e9ce381cc0b71/aiohttp-3.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB) Collecting attrs (from modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e0/44/827b2a91a5816512fcaf3cc4ebc465ccd5d598c45cefa6703fcf4a79018f/attrs-23.2.0-py3-none-any.whl (60 kB) Collecting gast>=0.2.2 (from modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/fa/39/5aae571e5a5f4de9c3445dae08a530498e5c53b0e74410eeeb0991c79047/gast-0.5.4-py3-none-any.whl (19 kB) Collecting oss2 (from modelscope) Using cached oss2-2.18.4-py3-none-any.whl Collecting python-dateutil>=2.1 (from modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/36/7a/87837f39d0296e723bb9b62bbb257d0355c7f6128853c78955f57342a56d/python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) Requirement already satisfied: setuptools in /root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages (from modelscope) (68.2.2) Collecting simplejson>=3.3.0 (from modelscope) Using cached 
https://pypi.tuna.tsinghua.edu.cn/packages/cb/b6/ed513a0adc3e2c9654864ffb68266dcab5720d5653428d690e7e4fb32a6c/simplejson-3.19.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (137 kB) Collecting sortedcontainers>=1.5.9 (from modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB) Collecting urllib3>=1.26 (from modelscope) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/88/75/311454fd3317aefe18415f04568edc20218453b709c63c58b9292c71be17/urllib3-2.2.0-py3-none-any.whl (120 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.9/120.9 kB 7.3 MB/s eta 0:00:00 WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response'))': /simple/aiosignal/ Collecting aiosignal>=1.1.2 (from aiohttp->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/76/ac/a7305707cb852b7e16ff80eaf5692309bde30e2b1100a1fcacdc8f731d97/aiosignal-1.3.1-py3-none-any.whl (7.6 kB) Collecting frozenlist>=1.1.1 (from aiohttp->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ec/25/0c87df2e53c0c5d90f7517ca0ff7aca78d050a8ec4d32c4278e8c0e52e51/frozenlist-1.4.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (239 kB) Collecting multidict<7.0,>=4.5 (from aiohttp->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/33/62/2c9085e571318d51212a6914566fe41dd0e33d7f268f7e2f23dcd3f06c56/multidict-6.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (124 kB) Collecting yarl<2.0,>=1.0 (from aiohttp->datasets) Using cached 
https://pypi.tuna.tsinghua.edu.cn/packages/c3/a0/0ade1409d184cbc9e85acd403a386a7c0563b92ff0f26d138ff9e86e48b4/yarl-1.9.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (301 kB) Collecting async-timeout<5.0,>=4.0 (from aiohttp->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a7/fa/e01228c2938de91d47b307831c62ab9e4001e747789d0b05baf779a6488c/async_timeout-4.0.3-py3-none-any.whl (5.7 kB) Collecting six>=1.5 (from python-dateutil>=2.1->modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB) Collecting charset-normalizer<4,>=2 (from requests->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/f1/3702ba2a7470666a62fd81c58a4c40be00670e5006a67f4d626e57f013ae/charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.1/142.1 kB 2.0 MB/s eta 0:00:00 Collecting idna<4,>=2.5 (from requests->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c2/e7/a82b05cf63a603df6e68d59ae6a68bf5064484a0718ea5033660af4b54a9/idna-3.6-py3-none-any.whl (61 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.6/61.6 kB 1.6 MB/s eta 0:00:00 Collecting certifi>=2017.4.17 (from requests->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ba/06/a07f096c664aeb9f01624f858c3add0a4e913d6c96257acb4fce61e7de14/certifi-2024.2.2-py3-none-any.whl (163 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 3.8 MB/s eta 0:00:00 INFO: pip is looking at multiple versions of tokenizers to determine which version is compatible with other requirements. This could take a while. 
Collecting tokenizers<0.15,>=0.14 (from transformers<=4.34.0,>=4.32.1) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/57/bd/45b5ef6b088880779f70acf60027f7043ca5fa1b98f4a4345cf3aea09044/tokenizers-0.14.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 20.6 MB/s eta 0:00:00 Collecting accelerate>=0.21.0 (from peft>=0.4.0) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e0/e5/20373eaee15adeb12872bc03355636c283cf3092fd7eb290bb974174b14e/accelerate-0.27.1-py3-none-any.whl (279 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 279.7/279.7 kB 5.5 MB/s eta 0:00:00 Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c8/14/73c3d62e709c2ace755c826997b12f883f3cb6b138dec63ac1e2a68cd910/accelerate-0.27.0-py3-none-any.whl (279 kB) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a6/b9/44623bdb05595481107153182e7f4b9f2ef9d3b674938ad13842054dcbd8/accelerate-0.26.1-py3-none-any.whl (270 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 270.9/270.9 kB 7.9 MB/s eta 0:00:00 INFO: pip is still looking at multiple versions of tokenizers to determine which version is compatible with other requirements. This could take a while. Downloading https://pypi.tuna.tsinghua.edu.cn/packages/63/9c/c10fc10df1d4968406b3f3cffe5a7d9988a8583e3423fc4156d6c91ab62d/accelerate-0.26.0-py3-none-any.whl (270 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 270.7/270.7 kB 4.5 MB/s eta 0:00:00 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f7/fc/c55e5a2da345c9a24aa2e1e0f60eb2ca290b6a41be82da03a6d4baec4f99/accelerate-0.25.0-py3-none-any.whl (265 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 265.7/265.7 kB 4.8 MB/s eta 0:00:00 Using cached https://pypi.tuna.tsinghua.edu.cn/packages/13/9e/ee987874058f2d93006961f6ff49e0bcb60ab9c26709ebe06bfa8707a4d8/accelerate-0.24.1-py3-none-any.whl (261 kB) INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. 
See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C. Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d0/cf/364d550af711b5abe5129ac676896b223ba5a082d97fe400527a59c0c1f8/accelerate-0.24.0-py3-none-any.whl (260 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 261.0/261.0 kB 8.2 MB/s eta 0:00:00 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/92/2d3aecf9f4a192968035880be3e2fc8b48d541c7128f7c936f430d6f96da/accelerate-0.23.0-py3-none-any.whl (258 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 258.1/258.1 kB 10.2 MB/s eta 0:00:00 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4d/a7/05c67003d659a0035f2b3a8cf389c1d9645865aee84a73ce99ddab16682f/accelerate-0.22.0-py3-none-any.whl (251 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 251.2/251.2 kB 15.0 MB/s eta 0:00:00 Collecting transformers_stream_generator Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bf/e8/785ec1627a60ca0ae7934525d2a24f419f146ff98b719f30ac76ced4fed4/transformers-stream-generator-0.0.3.tar.gz (12 kB) Preparing metadata (setup.py) ... 
done Collecting modelscope Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1d/1c/b40d3558879309e5b080e3f2eaaac016385487671508c362245bfd5e4cdf/modelscope-1.11.1-py3-none-any.whl (5.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.5/5.5 MB 66.6 MB/s eta 0:00:00 Collecting datasets Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ec/93/454ada0d1b289a0f4a86ac88dbdeab54921becabac45da3da787d136628f/datasets-2.16.1-py3-none-any.whl (507 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 507.1/507.1 kB 9.4 MB/s eta 0:00:00 Collecting dill<0.3.8,>=0.3.0 (from datasets) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f5/3a/74a29b11cf2cdfcd6ba89c0cecd70b37cd1ba7b77978ce611eb7a146a832/dill-0.3.7-py3-none-any.whl (115 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 115.3/115.3 kB 2.6 MB/s eta 0:00:00 Collecting datasets Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a0/93/da8a22a292e51ab76f969eb87bda8fd70cc3963b4dd71f67bb92a70a7992/datasets-2.16.0-py3-none-any.whl (507 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 507.1/507.1 kB 21.0 MB/s eta 0:00:00 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e2/cf/db41e572d7ed958e8679018f8190438ef700aeb501b62da9e1eed9e4d69a/datasets-2.15.0-py3-none-any.whl (521 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 521.2/521.2 kB 7.9 MB/s eta 0:00:00 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/00/23/80a2147a547cb2fd59eb92a13787c849b3efaefcea02a5c963dfc93f7c56/datasets-2.14.7-py3-none-any.whl (520 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 520.4/520.4 kB 7.2 MB/s eta 0:00:00 Collecting huggingface-hub>=0.17.0 (from peft>=0.4.0) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/aa/f3/3fc97336a0e90516901befd4f500f08d691034d387406fdbde85bea827cc/huggingface_hub-0.17.3-py3-none-any.whl (295 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 8.2 MB/s eta 0:00:00 Collecting feedparser==6.0.10 (from arxiv->lagent>=0.1.2) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/92/1e/741fd94cf2855d251712868f2183cb6485a28daaa3947e1a7046dc036aca/feedparser-6.0.10-py3-none-any.whl (81 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.1/81.1 kB 4.2 MB/s eta 0:00:00 Collecting sgmllib3k (from feedparser==6.0.10->arxiv->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/bd/3704a8c3e0942d711c1299ebf7b9091930adae6675d7c8f476a7ce48653c/sgmllib3k-1.0.0.tar.gz (5.8 kB) Preparing metadata (setup.py) ... done Collecting colorama>=0.4 (from griffe->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB) Collecting MarkupSafe>=2.0 (from jinja2->torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7c/52/2b1b570f6b8b803cef5ac28fdf78c0da318916c7d2fe9402a84d591b394c/MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB) Collecting jsonschema-specifications>=2023.03.6 (from jsonschema->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ee/07/44bd408781594c4d0a027666ef27fab1e441b109dc3b76b4f836f8fd04fe/jsonschema_specifications-2023.12.1-py3-none-any.whl (18 kB) Collecting referencing>=0.28.4 (from jsonschema->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/90/10/1c92edb0a0a14b67ff825bc338e74bc49ab27d3f3bae3f9a02838cba546f/referencing-0.33.0-py3-none-any.whl (26 kB) Collecting rpds-py>=0.7.1 (from jsonschema->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/15/f5/769fc90b3af55e6288ce683539ffd68b93dbdf1a5d86050f063828e5911e/rpds_py-0.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB) Collecting notebook (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5f/38/f5a11c1e68bf3dbd54c7c98f301bf9495e8735803b42ee2f740c5b7c1ca5/notebook-7.1.0-py3-none-any.whl (5.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 59.5 
MB/s eta 0:00:00 Collecting qtconsole (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/21/0887c50fa5bca7bfde29f65999a6ac234617f2a007b6b387aa4dc0ca36a8/qtconsole-5.5.1-py3-none-any.whl (123 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.4/123.4 kB 6.7 MB/s eta 0:00:00 Collecting jupyter-console (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/77/71d78d58f15c22db16328a476426f7ac4a60d3a5a7ba3b9627ee2f7903d4/jupyter_console-6.6.3-py3-none-any.whl (24 kB) Collecting nbconvert (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/ec/c120b21e7f884a701e12a241992754e719adaf430d0d6b30c6655776bc35/nbconvert-7.16.0-py3-none-any.whl (257 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 257.2/257.2 kB 10.9 MB/s eta 0:00:00 Collecting ipykernel (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/16/9a/0c7b514c73b42cf4ce516ee26c8940a0b23a9754dafaa459a939220240fd/ipykernel-6.29.2-py3-none-any.whl (116 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.1/116.1 kB 4.0 MB/s eta 0:00:00 Collecting ipywidgets (from jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/1a/7edeedb1c089d63ccd8bd5c0612334774e90cf9337de9fe6c82d90081791/ipywidgets-8.1.2-py3-none-any.whl (139 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.4/139.4 kB 4.1 MB/s eta 0:00:00 Collecting jupyter-core!=5.0.*,>=4.12 (from jupyter-client->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/86/a1/354cade6907f2fbbd32d89872ec64b62406028e7645ac13acfdb5732829e/jupyter_core-5.7.1-py3-none-any.whl (28 kB) Collecting pyzmq>=23.0 (from jupyter-client->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/67/bf/6bc0977acd934b66eacab79cec303ecf08ae4a6150d57c628aa919615488/pyzmq-25.1.2-cp310-cp310-manylinux_2_28_x86_64.whl (1.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 22.1 MB/s eta 0:00:00 Collecting tornado>=6.2 
(from jupyter-client->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9f/12/11d0a757bb67278d3380d41955ae98527d5ad18330b2edbdc8de222b569b/tornado-6.4-cp38-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (435 kB) Collecting traitlets>=5.3 (from jupyter-client->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/45/34/5dc77fdc7bb4bd198317eea5679edf9cc0a186438b5b19dbb9062fb0f4d5/traitlets-5.14.1-py3-none-any.whl (85 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 3.9 MB/s eta 0:00:00 Collecting contourpy>=1.0.1 (from matplotlib->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/58/56/e2c43dcfa1f9c7db4d5e3d6f5134b24ed953f4e2133a4b12f0062148db58/contourpy-1.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (310 kB) Collecting cycler>=0.10 (from matplotlib->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl (8.3 kB) Collecting fonttools>=4.22.0 (from matplotlib->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a6/ba/5eac3e9c9bbc2dea3606e46de08bcef0908d74e7ccf89a71701b95a16747/fonttools-4.49.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB) Collecting kiwisolver>=1.3.1 (from matplotlib->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6f/40/4ab1fdb57fced80ce5903f04ae1aed7c1d5939dda4fd0c0aa526c12fe28a/kiwisolver-1.4.5-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB) Collecting pyparsing>=2.3.1 (from matplotlib->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/92/8486ede85fcc088f1b3dba4ce92dd29d126fd96b0008ea213167940a2475/pyparsing-3.1.1-py3-none-any.whl (103 kB) INFO: pip is looking at multiple versions of multiprocess to determine which version is compatible with other requirements. This could take a while. 
Collecting multiprocess (from datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/35/a8/36d8d7b3e46b377800d8dec47891cdf05842d1a2366909ae4a0c89fbc5e6/multiprocess-0.70.15-py310-none-any.whl (134 kB) Collecting crcmod>=1.7 (from oss2->modelscope) Using cached crcmod-1.7-cp310-cp310-linux_x86_64.whl Collecting pycryptodome>=3.4.7 (from oss2->modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/af/20/5f29ec45462360e7f61e8688af9fe4a0afae057edfabdada662e11bf97e7/pycryptodome-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB) Collecting aliyun-python-sdk-kms>=2.4.1 (from oss2->modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/3d/ea/d88e08bfc4a0aee0111f1f24c98b19107bc6783441e7e944907c77b2243d/aliyun_python_sdk_kms-2.16.2-py2.py3-none-any.whl (94 kB) Collecting aliyun-python-sdk-core>=2.13.12 (from oss2->modelscope) Using cached aliyun_python_sdk_core-2.14.0-py3-none-any.whl Collecting pytz>=2020.1 (from pandas->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/3d/a121f284241f08268b21359bd425f7d4825cffc5ac5cd0e1b3d82ffd2b10/pytz-2024.1-py2.py3-none-any.whl (505 kB) Collecting tzdata>=2022.7 (from pandas->datasets) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/65/58/f9c9e6be752e9fcb8b6a0ee9fb87e6e7a1f6bcab2cdc73f02bb7ba91ada0/tzdata-2024.1-py2.py3-none-any.whl (345 kB) Collecting annotated-types>=0.4.0 (from pydantic->deepspeed>=0.12.3) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/28/78/d31230046e58c207284c6b2c4e8d96e6d3cb4e52354721b944d3e1ee4aa5/annotated_types-0.6.0-py3-none-any.whl (12 kB) Collecting pydantic-core==2.16.2 (from pydantic->deepspeed>=0.12.3) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/50/5e/2978d9f0e8d0cfd78e22115c028a41e0599e3d684e5aef7ed9bd18fcbd0c/pydantic_core-2.16.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB) Collecting lxml>=3.1.0 (from python-pptx->lagent>=0.1.2) Using cached 
https://pypi.tuna.tsinghua.edu.cn/packages/25/5c/979167df4ca5a1c308105bb1590412c54bd1b0baa1883212f39cb42d4fcd/lxml-5.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.0 MB) Collecting XlsxWriter>=0.5.7 (from python-pptx->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f7/3e/05ba2194cd5073602422859c949a4f21310a3c49bf8dccde9e03d4522b11/XlsxWriter-3.1.9-py3-none-any.whl (154 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.8/154.8 kB 8.8 MB/s eta 0:00:00 Collecting markdown-it-py>=2.2.0 (from rich->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/42/d7/1ec15b46af6af88f19b8e5ffea08fa375d433c998b8a7639e76935c14f1f/markdown_it_py-3.0.0-py3-none-any.whl (87 kB) Collecting pygments<3.0.0,>=2.13.0 (from rich->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/97/9c/372fef8377a6e340b1704768d20daaded98bf13282b5327beb2e2fe2c7ef/pygments-2.17.2-py3-none-any.whl (1.2 MB) Collecting mpmath>=0.19 (from sympy->torch) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 23.3 MB/s eta 0:00:00 Collecting importlib-metadata>=6.6.0 (from yapf->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c0/8b/d8427f023c081a8303e6ac7209c16e6878f2765d5b59667f3903fbcfd365/importlib_metadata-7.0.1-py3-none-any.whl (23 kB) Collecting platformdirs>=3.5.1 (from yapf->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/55/72/4898c44ee9ea6f43396fbc23d9bfaf3d06e01b83698bdf2e4c919deceb7c/platformdirs-4.2.0-py3-none-any.whl (17 kB) Collecting tomli>=2.0.1 (from yapf->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/97/75/10a9ebee3fd790d20926a90a2547f0bf78f371b2f13aa822c759680ca7b9/tomli-2.0.1-py3-none-any.whl (12 kB) Collecting jmespath<1.0.0,>=0.9.3 (from 
aliyun-python-sdk-core>=2.13.12->oss2->modelscope) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/07/cb/5f001272b6faeb23c1c9e0acc04d48eaaf5c862c17709d20e3469c6e0139/jmespath-0.10.0-py2.py3-none-any.whl (24 kB) Collecting cryptography>=2.6.0 (from aliyun-python-sdk-core>=2.13.12->oss2->modelscope) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4e/8a/a36f452b8cf725073521c8e7af664d85b337d699f29cb5845d92977af1ca/cryptography-42.0.3-cp39-abi3-manylinux_2_28_x86_64.whl (4.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.6/4.6 MB 69.1 MB/s eta 0:00:00 Collecting zipp>=0.5 (from importlib-metadata>=6.6.0->yapf->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d9/66/48866fc6b158c81cc2bfecc04c480f105c6040e8b077bc54c634b4a67926/zipp-3.17.0-py3-none-any.whl (7.4 kB) Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->mmengine>=0.9.1) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl (10.0 kB) Collecting comm>=0.1.1 (from ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6e/c1/e7335bd49aa3fa3bd453e34a4580b0076804f219897ad76d4d5aa4d8f22f/comm-0.2.1-py3-none-any.whl (7.2 kB) Collecting debugpy>=1.6.5 (from ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7a/27/78d5cf9c7aba43f8341e78273ab776913d2d33beb581ec39b65e56a0db77/debugpy-1.8.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 34.2 MB/s eta 0:00:00 Collecting ipython>=7.23.1 (from ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fb/e7/07dc8b6541affd4de15f0e8fc855f238cb93d04c4f8490757226d12cdb5a/ipython-8.21.0-py3-none-any.whl (810 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 810.0/810.0 kB 18.4 MB/s eta 0:00:00 Collecting matplotlib-inline>=0.1 (from ipykernel->jupyter->lagent>=0.1.2) 
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f2/51/c34d7a1d528efaae3d8ddb18ef45a41f284eacf9e514523b191b7d0872cc/matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB) Collecting nest-asyncio (from ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a0/c4/c2971a3ba4c6103a3d10c4b0f24f461ddc027f0f09763220cf35ca1401b3/nest_asyncio-1.6.0-py3-none-any.whl (5.2 kB) Collecting widgetsnbextension~=4.0.10 (from ipywidgets->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/99/bc/82a8c3985209ca7c0a61b383c80e015fd92e74f8ba0ec1af98f9d6ca8dce/widgetsnbextension-4.0.10-py3-none-any.whl (2.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 39.1 MB/s eta 0:00:00 Collecting jupyterlab-widgets~=3.0.10 (from ipywidgets->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/24/da/db1cb0387a7e4086780aff137987ee924e953d7f91b2a870f994b9b1eeb8/jupyterlab_widgets-3.0.10-py3-none-any.whl (215 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 215.0/215.0 kB 6.9 MB/s eta 0:00:00 Collecting prompt-toolkit>=3.0.30 (from jupyter-console->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ee/fd/ca7bf3869e7caa7a037e23078539467b433a4e01eebd93f77180ab927766/prompt_toolkit-3.0.43-py3-none-any.whl (386 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.1/386.1 kB 7.8 MB/s eta 0:00:00 Collecting beautifulsoup4 (from nbconvert->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b1/fe/e8c672695b37eecc5cbf43e1d0638d88d66ba3a44c4d321c796f4e59167f/beautifulsoup4-4.12.3-py3-none-any.whl (147 kB) Collecting bleach!=5.0.0 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ea/63/da7237f805089ecc28a3f36bca6a21c31fcbc2eb380f3b8f1be3312abd14/bleach-6.1.0-py3-none-any.whl (162 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 162.8/162.8 kB 4.4 MB/s eta 0:00:00 Collecting defusedxml (from nbconvert->jupyter->lagent>=0.1.2) 
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/07/6c/aa3f2f849e01cb6a001cd8554a88d4c77c5c1a31c95bdf1cf9301e6d9ef4/defusedxml-0.7.1-py2.py3-none-any.whl (25 kB) Collecting jupyterlab-pygments (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b1/dd/ead9d8ea85bf202d90cc513b533f9c363121c7792674f78e0d8a854b63b4/jupyterlab_pygments-0.3.0-py3-none-any.whl (15 kB) Collecting mistune<4,>=2.0.3 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f0/74/c95adcdf032956d9ef6c89a9b8a5152bf73915f8c633f3e3d88d06bd699c/mistune-3.0.2-py3-none-any.whl (47 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.0/48.0 kB 3.4 MB/s eta 0:00:00 Collecting nbclient>=0.5.0 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6b/3a/607149974149f847125c38a62b9ea2b8267eb74823bbf8d8c54ae0212a00/nbclient-0.9.0-py3-none-any.whl (24 kB) Collecting nbformat>=5.7 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f4/e7/ef30a90b70eba39e675689b9eaaa92530a71d7435ab8f9cae520814e0caf/nbformat-5.9.2-py3-none-any.whl (77 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.6/77.6 kB 4.4 MB/s eta 0:00:00 Collecting pandocfilters>=1.4.1 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ef/af/4fbc8cab944db5d21b7e2a5b8e9211a03a79852b1157e2c102fcc61ac440/pandocfilters-1.5.1-py2.py3-none-any.whl (8.7 kB) Collecting tinycss2 (from nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/99/fd23634d6962c2791fb8cb6ccae1f05dcbfc39bce36bba8b1c9a8d92eae8/tinycss2-1.2.1-py3-none-any.whl (21 kB) Collecting jupyter-server<3,>=2.4.0 (from notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/25/d6/6ee093c967d11144aeb1b0b4952d30e51da8eb2737837ab612084c783a58/jupyter_server-2.12.5-py3-none-any.whl (380 kB) 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 380.3/380.3 kB 14.2 MB/s eta 0:00:00 Collecting jupyterlab-server<3,>=2.22.1 (from notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ab/ac/a19c579bb8ab2a2aefcf47cd3787683e6e136378d7ab2602be3b8e628030/jupyterlab_server-2.25.3-py3-none-any.whl (58 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.0/59.0 kB 3.1 MB/s eta 0:00:00 Collecting jupyterlab<4.2,>=4.1.1 (from notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/61/9b/8b974903425893806b15413fc899fefa78b0ed53e1699bcb8838c01a0ab2/jupyterlab-4.1.1-py3-none-any.whl (11.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.4/11.4 MB 60.6 MB/s eta 0:00:00 Collecting notebook-shim<0.3,>=0.2 (from notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f9/33/bd5b9137445ea4b680023eb0469b2bb969d61303dedb2aac6560ff3d14a1/notebook_shim-0.2.4-py3-none-any.whl (13 kB) Collecting qtpy>=2.4.0 (from qtconsole->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/a9/2146d5117ad8a81185331e0809a6b48933c10171f5bac253c6df9fce991c/QtPy-2.4.1-py3-none-any.whl (93 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 93.5/93.5 kB 2.7 MB/s eta 0:00:00 Collecting webencodings (from bleach!=5.0.0->nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f4/24/2a3e3df732393fed8b3ebf2ec078f05546de641fe1b667ee316ec1dcf3b7/webencodings-0.5.1-py2.py3-none-any.whl (11 kB) Collecting cffi>=1.12 (from cryptography>=2.6.0->aliyun-python-sdk-core>=2.13.12->oss2->modelscope) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/7c/43d81bdd5a915923c3bad5bb4bff401ea00ccc8e28433fb6083d2e3bf58e/cffi-1.16.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (443 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 443.9/443.9 kB 6.7 MB/s eta 0:00:00 Collecting decorator (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl (9.1 kB) Collecting jedi>=0.16 (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/20/9f/bc63f0f0737ad7a60800bfd472a4836661adae21f9c2535f3957b1e54ceb/jedi-0.19.1-py2.py3-none-any.whl (1.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 25.3 MB/s eta 0:00:00 Collecting stack-data (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f1/7b/ce1eafaf1a76852e2ec9b22edecf1daa58175c090266e9f6c64afcd81d91/stack_data-0.6.3-py3-none-any.whl (24 kB) Collecting exceptiongroup (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b8/9a/5028fd52db10e600f1c4674441b968cf2ea4959085bfb5b99fb1250e5f68/exceptiongroup-1.2.0-py3-none-any.whl (16 kB) Collecting pexpect>4.3 (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/c3/059298687310d527a58bb01f3b1965787ee3b40dce76752eda8b44e9a2c5/pexpect-4.9.0-py2.py3-none-any.whl (63 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 2.6 MB/s eta 0:00:00 Collecting anyio>=3.1.0 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/bf/cd/d6d9bb1dadf73e7af02d18225cbd2c93f8552e13130484f1c8dcfece292b/anyio-4.2.0-py3-none-any.whl (85 kB) Collecting argon2-cffi (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a4/6a/e8a041599e78b6b3752da48000b14c8d1e8a04ded09c88c714ba047f34f5/argon2_cffi-23.1.0-py3-none-any.whl (15 kB) Collecting jupyter-events>=0.9.0 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/e3/55/0c1aa72f4317e826a471dc4adc3036acd11d496ded68c4bbac2a88551519/jupyter_events-0.9.0-py3-none-any.whl (18 kB) Collecting jupyter-server-terminals (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7c/ec/ebb52454525e1d346bfa2ea91b3dcda3b92687bb73b2c25a6d621d9eeaf1/jupyter_server_terminals-0.5.2-py3-none-any.whl (13 kB) Collecting overrides (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2c/ab/fc8290c6a4c722e5514d80f62b2dc4c4df1a68a41d1364e625c35990fcf3/overrides-7.7.0-py3-none-any.whl (17 kB) Collecting prometheus-client (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c7/98/745b810d822103adca2df8decd4c0bbe839ba7ad3511af3f0d09692fc0f0/prometheus_client-0.20.0-py3-none-any.whl (54 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.5/54.5 kB 2.5 MB/s eta 0:00:00 Collecting send2trash>=1.8.2 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a9/78/e4df1e080ed790acf3a704edf521006dd96b9841bd2e2a462c0d255e0565/Send2Trash-1.8.2-py3-none-any.whl (18 kB) Collecting terminado>=0.8.3 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/69/df/deebc9fb14a49062a3330f673e80b100e665b54d998163b3f62620b6240c/terminado-0.18.0-py3-none-any.whl (14 kB) Collecting websocket-client (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1e/70/1e88138a9afbed1d37093b85f0bebc3011623c4f47c166431599fe9d6c93/websocket_client-1.7.0-py3-none-any.whl (58 kB) Collecting async-lru>=1.0.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/fa/9f/3c3503693386c4b0f245eaf5ca6198e3b28879ca0a40bde6b0e319793453/async_lru-2.0.4-py3-none-any.whl (6.1 kB) Collecting httpx>=0.25.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/9b/4937d841aee9c2c8102d9a4eeb800c7dad25386caabb4a1bf5010df81a57/httpx-0.26.0-py3-none-any.whl (75 kB) Collecting jupyter-lsp>=2.0.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d4/35/8332e7a07f872324e29ae4620a41a21372a8dc710b63b873d80cb2184241/jupyter_lsp-2.2.2-py3-none-any.whl (68 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 2.2 MB/s eta 0:00:00 Collecting babel>=2.10 (from jupyterlab-server<3,>=2.22.1->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0d/35/4196b21041e29a42dc4f05866d0c94fa26c9da88ce12c38c2265e42c82fb/Babel-2.14.0-py3-none-any.whl (11.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.0/11.0 MB 61.9 MB/s eta 0:00:00 Collecting fastjsonschema (from nbformat>=5.7->nbconvert->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9c/b9/79691036d4a8f9857e74d1728b23f34f583b81350a27492edda58d5604e1/fastjsonschema-2.19.1-py3-none-any.whl (23 kB) Collecting wcwidth (from prompt-toolkit>=3.0.30->jupyter-console->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fd/84/fd2ba7aafacbad3c4201d395674fc6348826569da3c0937e75505ead3528/wcwidth-0.2.13-py2.py3-none-any.whl (34 kB) Collecting soupsieve>1.2 (from beautifulsoup4->nbconvert->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/4c/f3/038b302fdfbe3be7da016777069f26ceefe11a681055ea1f7817546508e3/soupsieve-2.5-py3-none-any.whl (36 kB) Collecting sniffio>=1.1 (from anyio>=3.1.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Using cached 
https://pypi.tuna.tsinghua.edu.cn/packages/c3/a0/5dba8ed157b0136607c7f2151db695885606968d1fae123dc3391e0cfdbf/sniffio-1.3.0-py3-none-any.whl (10 kB) Collecting pycparser (from cffi>=1.12->cryptography>=2.6.0->aliyun-python-sdk-core>=2.13.12->oss2->modelscope) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/62/d5/5f610ebe421e85889f2e55e33b7f9a6795bd982198517d912eb1c76e1a53/pycparser-2.21-py2.py3-none-any.whl (118 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.7/118.7 kB 6.7 MB/s eta 0:00:00 Collecting httpcore==1.* (from httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/11/a6/24139fa27831cf2127fcf578d6d0a852a611f10cefecd800b1c557333d7a/httpcore-1.0.3-py3-none-any.whl (77 kB) Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/95/04/ff642e65ad6b90db43e668d70ffb6736436c7ce41fcc549f4e9472234127/h11-0.14.0-py3-none-any.whl (58 kB) Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/05/63/8011bd08a4111858f79d2b09aad86638490d62fbf881c44e434a6dfca87b/parso-0.8.3-py2.py3-none-any.whl (100 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.8/100.8 kB 3.1 MB/s eta 0:00:00 Collecting python-json-logger>=2.0.4 (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/35/a6/145655273568ee78a581e734cf35beb9e33a370b29c5d3c8fee3744de29f/python_json_logger-2.0.7-py3-none-any.whl (8.1 kB) Collecting rfc3339-validator (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/44/4e421b96b67b2daff264473f7465db72fbdf36a07e05494f50300cc7b0c6/rfc3339_validator-0.1.4-py2.py3-none-any.whl (3.5 kB) Collecting 
rfc3986-validator>=0.1.1 (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/51/17023c0f8f1869d8806b979a2bffa3f861f26a3f1a66b094288323fba52f/rfc3986_validator-0.1.1-py2.py3-none-any.whl (4.2 kB) Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/22/a6/858897256d0deac81a172289110f31629fc4cee19b6f01283303e18c8db3/ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB) Collecting argon2-cffi-bindings (from argon2-cffi->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ec/f7/378254e6dd7ae6f31fe40c8649eea7d4832a42243acaf0f1fff9083b2bed/argon2_cffi_bindings-21.2.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (86 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.2/86.2 kB 3.3 MB/s eta 0:00:00 Collecting executing>=1.2.0 (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/03/6ea8b1b2a5ab40a7a60dc464d3daa7aa546e0a74d74a9f8ff551ea7905db/executing-2.0.1-py2.py3-none-any.whl (24 kB) Collecting asttokens>=2.1.0 (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/45/86/4736ac618d82a20d87d2f92ae19441ebc7ac9e7a581d7e58bbe79233b24a/asttokens-2.4.1-py2.py3-none-any.whl (27 kB) Collecting pure-eval (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/2b/27/77f9d5684e6bce929f5cfe18d6cfbe5133013c06cb2fbf5933670e60761d/pure_eval-0.2.2-py3-none-any.whl (11 kB) Collecting fqdn (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading 
https://pypi.tuna.tsinghua.edu.cn/packages/cf/58/8acf1b3e91c58313ce5cb67df61001fc9dcd21be4fadb76c1a2d540e09ed/fqdn-1.5.1-py3-none-any.whl (9.1 kB) Collecting isoduration (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/55/e5326141505c5d5e34c5e0935d2908a74e4561eca44108fbfb9c13d2911a/isoduration-20.11.0-py3-none-any.whl (11 kB) Collecting jsonpointer>1.13 (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/12/f6/0232cc0c617e195f06f810534d00b74d2f348fe71b2118009ad8ad31f878/jsonpointer-2.4-py2.py3-none-any.whl (7.8 kB) Collecting uri-template (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e7/00/3fca040d7cf8a32776d3d81a00c8ee7457e00f80c649f1e4a863c8321ae9/uri_template-1.3.0-py3-none-any.whl (11 kB) Collecting webcolors>=1.11 (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/e1/3e9013159b4cbb71df9bd7611cbf90dc2c621c8aeeb677fc41dad72f2261/webcolors-1.13-py3-none-any.whl (14 kB) Collecting arrow>=0.15.0 (from isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f8/ed/e97229a566617f2ae958a6b13e7cc0f585470eac730a73e9e82c32a3cdd2/arrow-1.3.0-py3-none-any.whl (66 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.4/66.4 kB 2.6 MB/s eta 0:00:00 Collecting types-python-dateutil>=2.8.10 (from arrow>=0.15.0->isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2) 
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/28/50/8ed67814241e2684369f4b8b881c7d31a0816e76c8690ea8518017a35b7e/types_python_dateutil-2.8.19.20240106-py3-none-any.whl (9.7 kB) Building wheels for collected packages: deepspeed, transformers_stream_generator, google-search-results, timeout-decorator, sgmllib3k Building wheel for deepspeed (setup.py) ... done Created wheel for deepspeed: filename=deepspeed-0.13.2-py3-none-any.whl size=1360129 sha256=5a6d09dea8b23f25239acb54f85a328669ca13ff9f28c4ed1399e3901ca69b21 Stored in directory: /root/.cache/pip/wheels/6b/52/d6/8664e01a03a3319fd361eaf654bbb1f7a80f05787be0e7e459 Building wheel for transformers_stream_generator (setup.py) ... done Created wheel for transformers_stream_generator: filename=transformers_stream_generator-0.0.4-py3-none-any.whl size=12315 sha256=ef4d835e7f15820d8ac620c41eb2603c339195862e927961a0fd5e85b503fc1f Stored in directory: /root/.cache/pip/wheels/24/87/bd/5e5946d5ef3a69f27e87150dbb594c65c885479f43ab8447cc Building wheel for google-search-results (setup.py) ... done Created wheel for google-search-results: filename=google_search_results-2.4.2-py3-none-any.whl size=32003 sha256=1881aff239c5b9f1b8b5a98d49d8132be1e30fb7b7e3f405b877187b84ae577e Stored in directory: /root/.cache/pip/wheels/4b/db/65/19f4faee33d79fd89d3f819076a95942bd846a0200219d6894 Building wheel for timeout-decorator (setup.py) ... done Created wheel for timeout-decorator: filename=timeout_decorator-0.5.0-py3-none-any.whl size=5006 sha256=122091342a8a8b119b38108d844925d04f52a53e6a92067bccb546721e3f4b1f Stored in directory: /root/.cache/pip/wheels/d0/ae/f0/dd56ad3830c63d59c976ca1d36a30ec8e4a16f222a992b157a Building wheel for sgmllib3k (setup.py) ... 
done Created wheel for sgmllib3k: filename=sgmllib3k-1.0.0-py3-none-any.whl size=6047 sha256=3de75c3a613db99d576957952522649072ee771e56d81ba413067c50264dcef5 Stored in directory: /root/.cache/pip/wheels/50/20/4b/e95fc891917d652cb6ecbfea035cf3ce640259cf857aaa21a7 Successfully built deepspeed transformers_stream_generator google-search-results timeout-decorator sgmllib3k Installing collected packages: webencodings, wcwidth, timeout-decorator, sortedcontainers, sgmllib3k, SentencePiece, pytz, py-cpuinfo, pure-eval, ptyprocess, ninja, mpmath, json5, hjson, func-timeout, fastjsonschema, crcmod, addict, zipp, xxhash, XlsxWriter, widgetsnbextension, websocket-client, webcolors, urllib3, uri-template, tzdata, typing-extensions, types-python-dateutil, traitlets, tqdm, tornado, tomli, tinycss2, termcolor, sympy, soupsieve, sniffio, six, simplejson, send2trash, safetensors, rpds-py, rfc3986-validator, regex, pyzmq, pyyaml, python-json-logger, pyparsing, pynvml, pygments, pycryptodome, pycparser, pyarrow-hotfix, psutil, prompt-toolkit, prometheus-client, platformdirs, pillow, phx-class-registry, pexpect, parso, pandocfilters, packaging, overrides, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, nest-asyncio, multidict, mpi4py-mpich, mistune, mdurl, MarkupSafe, lxml, kiwisolver, jupyterlab-widgets, jupyterlab-pygments, jsonpointer, jmespath, idna, h11, gast, fsspec, frozenlist, fqdn, fonttools, filelock, feedparser, executing, exceptiongroup, einops, distro, dill, defusedxml, decorator, debugpy, cycler, colorama, charset-normalizer, certifi, babel, attrs, async-timeout, annotated-types, yarl, triton, terminado, scipy, rfc3339-validator, requests, referencing, qtpy, python-pptx, python-dateutil, pydantic-core, pyarrow, opencv-python, nvidia-cusparse-cu12, nvidia-cudnn-cu12, multiprocess, matplotlib-inline, 
markdown-it-py, jupyter-core, jinja2, jedi, importlib-metadata, httpcore, griffe, contourpy, comm, cffi, bleach, beautifulsoup4, async-lru, asttokens, anyio, aiosignal, yapf, tiktoken, stack-data, rich, pydantic, pandas, nvidia-cusolver-cu12, matplotlib, jupyter-server-terminals, jupyter-client, jsonschema-specifications, huggingface-hub, httpx, google-search-results, cryptography, bitsandbytes, arxiv, arrow, argon2-cffi-bindings, aiohttp, torch, tokenizers, mmengine, jsonschema, isoduration, ipython, argon2-cffi, aliyun-python-sdk-core, transformers, nbformat, ipywidgets, ipykernel, deepspeed, datasets, aliyun-python-sdk-kms, accelerate, transformers_stream_generator, qtconsole, peft, oss2, nbclient, jupyter-events, jupyter-console, nbconvert, modelscope, jupyter-server, notebook-shim, jupyterlab-server, jupyter-lsp, jupyterlab, notebook, jupyter, lagent, xtuner Running setup.py develop for xtuner Successfully installed MarkupSafe-2.1.5 SentencePiece-0.1.99 XlsxWriter-3.1.9 accelerate-0.27.2 addict-2.4.0 aiohttp-3.9.3 aiosignal-1.3.1 aliyun-python-sdk-core-2.14.0 aliyun-python-sdk-kms-2.16.2 annotated-types-0.6.0 anyio-4.2.0 argon2-cffi-23.1.0 argon2-cffi-bindings-21.2.0 arrow-1.3.0 arxiv-2.1.0 asttokens-2.4.1 async-lru-2.0.4 async-timeout-4.0.3 attrs-23.2.0 babel-2.14.0 beautifulsoup4-4.12.3 bitsandbytes-0.42.0 bleach-6.1.0 certifi-2024.2.2 cffi-1.16.0 charset-normalizer-3.3.2 colorama-0.4.6 comm-0.2.1 contourpy-1.2.0 crcmod-1.7 cryptography-42.0.3 cycler-0.12.1 datasets-2.14.7 debugpy-1.8.1 decorator-5.1.1 deepspeed-0.13.2 defusedxml-0.7.1 dill-0.3.7 distro-1.9.0 einops-0.7.0 exceptiongroup-1.2.0 executing-2.0.1 fastjsonschema-2.19.1 feedparser-6.0.10 filelock-3.13.1 fonttools-4.49.0 fqdn-1.5.1 frozenlist-1.4.1 fsspec-2023.6.0 func-timeout-4.3.5 gast-0.5.4 google-search-results-2.4.2 griffe-0.40.1 h11-0.14.0 hjson-3.1.0 httpcore-1.0.3 httpx-0.26.0 huggingface-hub-0.17.3 idna-3.6 importlib-metadata-7.0.1 ipykernel-6.29.2 ipython-8.21.0 ipywidgets-8.1.2 
isoduration-20.11.0 jedi-0.19.1 jinja2-3.1.3 jmespath-0.10.0 json5-0.9.14 jsonpointer-2.4 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 jupyter-1.0.0 jupyter-client-8.6.0 jupyter-console-6.6.3 jupyter-core-5.7.1 jupyter-events-0.9.0 jupyter-lsp-2.2.2 jupyter-server-2.12.5 jupyter-server-terminals-0.5.2 jupyterlab-4.1.1 jupyterlab-pygments-0.3.0 jupyterlab-server-2.25.3 jupyterlab-widgets-3.0.10 kiwisolver-1.4.5 lagent-0.2.1 lxml-5.1.0 markdown-it-py-3.0.0 matplotlib-3.8.3 matplotlib-inline-0.1.6 mdurl-0.1.2 mistune-3.0.2 mmengine-0.10.3 modelscope-1.12.0 mpi4py-mpich-3.1.5 mpmath-1.3.0 multidict-6.0.5 multiprocess-0.70.15 nbclient-0.9.0 nbconvert-7.16.0 nbformat-5.9.2 nest-asyncio-1.6.0 networkx-3.2.1 ninja-1.11.1.1 notebook-7.1.0 notebook-shim-0.2.4 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 opencv-python-4.9.0.80 oss2-2.18.4 overrides-7.7.0 packaging-23.2 pandas-2.2.0 pandocfilters-1.5.1 parso-0.8.3 peft-0.8.2 pexpect-4.9.0 phx-class-registry-4.1.0 pillow-10.2.0 platformdirs-4.2.0 prometheus-client-0.20.0 prompt-toolkit-3.0.43 psutil-5.9.8 ptyprocess-0.7.0 pure-eval-0.2.2 py-cpuinfo-9.0.0 pyarrow-15.0.0 pyarrow-hotfix-0.6 pycparser-2.21 pycryptodome-3.20.0 pydantic-2.6.1 pydantic-core-2.16.2 pygments-2.17.2 pynvml-11.5.0 pyparsing-3.1.1 python-dateutil-2.8.2 python-json-logger-2.0.7 python-pptx-0.6.23 pytz-2024.1 pyyaml-6.0.1 pyzmq-25.1.2 qtconsole-5.5.1 qtpy-2.4.1 referencing-0.33.0 regex-2023.12.25 requests-2.31.0 rfc3339-validator-0.1.4 rfc3986-validator-0.1.1 rich-13.7.0 rpds-py-0.18.0 safetensors-0.4.2 scipy-1.12.0 send2trash-1.8.2 sgmllib3k-1.0.0 simplejson-3.19.2 six-1.16.0 sniffio-1.3.0 sortedcontainers-2.4.0 soupsieve-2.5 
stack-data-0.6.3 sympy-1.12 termcolor-2.4.0 terminado-0.18.0 tiktoken-0.6.0 timeout-decorator-0.5.0 tinycss2-1.2.1 tokenizers-0.14.1 tomli-2.0.1 torch-2.2.0 tornado-6.4 tqdm-4.66.2 traitlets-5.14.1 transformers-4.34.0 transformers_stream_generator-0.0.4 triton-2.2.0 types-python-dateutil-2.8.19.20240106 typing-extensions-4.9.0 tzdata-2024.1 uri-template-1.3.0 urllib3-2.2.0 wcwidth-0.2.13 webcolors-1.13 webencodings-0.5.1 websocket-client-1.7.0 widgetsnbextension-4.0.10 xtuner-0.1.9 xxhash-3.4.1 yapf-0.40.2 yarl-1.9.4 zipp-3.17.0 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Preparation: we will fine-tune internlm-chat-7b on the oasst1 dataset.
# Create a working directory for fine-tuning on oasst1, then enter it
mkdir ~/ft-oasst1 && cd ~/ft-oasst1
4.3、Fine-tuning
Copy a config file into the current directory:
cd ~/ft-oasst1
xtuner copy-cfg internlm_chat_7b_qlora_oasst1_e3 .
Screen output:
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# cd ~/ft-oasst1
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner copy-cfg internlm_chat_7b_qlora_oasst1_e3 .
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-17 20:19:53,154] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-17 20:20:41,442] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Copy to ./internlm_chat_7b_qlora_oasst1_e3_copy.py
Get the base model by symlinking the copy that already exists on the teaching platform:
ln -s /share/temp/model_repos/internlm-chat-7b ~/ft-oasst1/
Prepare the dataset by copying the one already available on the teaching platform:
cd ~/ft-oasst1
# Note: "...-guanaco" is followed by a space and a period (.)
cp -r /root/share/temp/datasets/openassistant-guanaco .
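Before editing the config, it may be worth confirming that both the model symlink and the dataset copy actually resolve. The following is a minimal sketch, not part of the course material: the helper name `missing_paths` is hypothetical, and the directory names are the ones used in the commands above.

```python
from pathlib import Path


def missing_paths(workdir, names=("internlm-chat-7b", "openassistant-guanaco")):
    """Return the entries from `names` that do not exist under `workdir`."""
    root = Path(workdir).expanduser()
    # Path.exists() follows symlinks, so a dangling link counts as missing.
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_paths("~/ft-oasst1")
    print("missing:", missing if missing else "nothing")
```

An empty result suggests that the relative paths ./internlm-chat-7b and ./openassistant-guanaco will resolve when training is launched from ~/ft-oasst1.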
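Edit the copied config (internlm_chat_7b_qlora_oasst1_e3_copy.py) so that the model and dataset point to the local paths prepared above, ./internlm-chat-7b and ./openassistant-guanaco.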
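Judging from the config dump at the top of the training log, the edited lines end up looking roughly like the excerpt below. The three values are copied from that dump; the rest of the file is not shown, and the max_epochs change is inferred from the log (max_epochs = 1) rather than stated in these notes.

```python
# Excerpt of internlm_chat_7b_qlora_oasst1_e3_copy.py after editing.
# Values are taken from the config dump in the training log.

# Base model: the symlink created in ~/ft-oasst1
pretrained_model_name_or_path = './internlm-chat-7b'

# Dataset: the local copy of openassistant-guanaco
data_path = './openassistant-guanaco'

# The log shows a single epoch for this run (inferred, see above)
max_epochs = 1
```

Both paths are relative, so they presumably only resolve when `xtuner train` is started from ~/ft-oasst1.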
Run the fine-tuning:
xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py
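The config used for this run (dumped in the training log) pairs lr = 0.0002 with mmengine.optim.CosineAnnealingLR and eta_min = 0.0, and the run has 2180 iterations. The sketch below evaluates the standard cosine annealing formula by hand, to show where the slowly shrinking lr values in the log come from. It is an independent re-implementation of the formula, not mmengine's code; the function name `cosine_lr` is hypothetical, and the step offset of one is an assumption chosen because it reproduces the logged values.

```python
import math


def cosine_lr(step, total_steps, base_lr=2e-4, eta_min=0.0):
    """Cosine-annealed learning rate after `step` scheduler updates."""
    cos_term = math.cos(math.pi * step / total_steps)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cos_term)


if __name__ == "__main__":
    total = 2180  # iterations in the single epoch of this run
    for logged_iter in (10, 20, 500):
        # Assumption: the value logged at iteration k reflects k - 1 updates.
        print(logged_iter, f"{cosine_lr(logged_iter - 1, total):.4e}")
    # prints:
    # 10 1.9999e-04
    # 20 1.9996e-04
    # 500 1.7524e-04
```

These match the lr fields logged at iterations 10, 20 and 500, which suggests the schedule decays per iteration across the whole epoch (the config sets convert_to_iter_based=True).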
The log of the training run is written to ~/ft-oasst1/work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy/20240217_204948.log. Its contents are as follows:
2024/02/17 20:49:48 - mmengine - INFO - ------------------------------------------------------------ System environment: sys.platform: linux Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] CUDA available: True MUSA available: False numpy_random_seed: 528481291 GPU 0: NVIDIA A100-SXM4-80GB CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 11.7, V11.7.99 GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 PyTorch: 2.2.0+cu121 PyTorch compiling details: PyTorch built with: - GCC 9.3 - C++ Version: 201703 - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01) - OpenMP 201511 (a.k.a. OpenMP 4.5) - LAPACK is enabled (usually provided by MKL) - NNPACK is enabled - CPU capability usage: AVX512 - CUDA Runtime 12.1 - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90 - CuDNN 8.9.2 - Magma 2.6.1 - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces 
-fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, OpenCV: 4.9.0 MMEngine: 0.10.3 Runtime environment: cudnn_benchmark: False mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0} dist_cfg: {'backend': 'nccl'} seed: 528481291 deterministic: False Distributed launcher: none Distributed training: False GPU number: 1 ------------------------------------------------------------ 2024/02/17 20:49:49 - mmengine - INFO - Config: SYSTEM = '' accumulative_counts = 16 batch_size = 1 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.DatasetInfoHook'), dict( evaluation_inputs=[ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ], every_n_iters=500, prompt_template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', system='', tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.EvaluateChatHook'), ] data_path = './openassistant-guanaco' dataloader_num_workers = 0 default_hooks = dict( checkpoint=dict(interval=1, type='mmengine.hooks.CheckpointHook'), logger=dict(interval=10, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, 
dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) evaluation_freq = 500 evaluation_inputs = [ '请给我介绍五个上海的景点', 'Please tell me five scenic spots in Shanghai', ] launcher = 'none' load_from = None log_level = 'INFO' lr = 0.0002 max_epochs = 1 max_length = 2048 max_norm = 1 model = dict( llm=dict( pretrained_model_name_or_path='./internlm-chat-7b', quantization_config=dict( bnb_4bit_compute_dtype='torch.float16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, llm_int8_has_fp16_weight=False, llm_int8_threshold=6.0, load_in_4bit=True, load_in_8bit=False, type='transformers.BitsAndBytesConfig'), torch_dtype='torch.float16', trust_remote_code=True, type='transformers.AutoModelForCausalLM.from_pretrained'), lora=dict( bias='none', lora_alpha=16, lora_dropout=0.1, r=64, task_type='CAUSAL_LM', type='peft.LoraConfig'), type='xtuner.model.SupervisedFinetune') optim_type = 'bitsandbytes.optim.PagedAdamW32bit' optim_wrapper = dict( accumulative_counts=16, clip_grad=dict(error_if_nonfinite=False, max_norm=1), dtype='float16', loss_scale='dynamic', optimizer=dict( betas=( 0.9, 0.999, ), lr=0.0002, type='bitsandbytes.optim.PagedAdamW32bit', weight_decay=0), type='mmengine.optim.AmpOptimWrapper') pack_to_max_length = True param_scheduler = dict( T_max=1, by_epoch=True, convert_to_iter_based=True, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR') pretrained_model_name_or_path = './internlm-chat-7b' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm_chat' randomness = dict(deterministic=False, seed=None) resume = False tokenizer = dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(by_epoch=True, max_epochs=1, val_interval=1) train_dataloader = dict( batch_size=1, collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'), dataset=dict( dataset=dict( path='./openassistant-guanaco', 
type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset'), num_workers=0, sampler=dict(shuffle=True, type='mmengine.dataset.DefaultSampler')) train_dataset = dict( dataset=dict(path='./openassistant-guanaco', type='datasets.load_dataset'), dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn', max_length=2048, pack_to_max_length=True, remove_unused_columns=True, shuffle_before_pack=True, template_map_fn=dict( template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat', type='xtuner.dataset.map_fns.template_map_fn_factory'), tokenizer=dict( padding_side='right', pretrained_model_name_or_path='./internlm-chat-7b', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.dataset.process_hf_dataset') visualizer = None weight_decay = 0 work_dir = './work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy' 2024/02/17 20:49:52 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized. 2024/02/17 20:50:24 - mmengine - INFO - dispatch internlm attn forward 2024/02/17 20:50:24 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the `output_attentions` flag is set to True, it is not possible to return the `attn_weights`. 
2024/02/17 20:50:43 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used. 2024/02/17 20:50:44 - mmengine - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) RuntimeInfoHook (BELOW_NORMAL) LoggerHook -------------------- before_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DatasetInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook -------------------- before_train_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) DistSamplerSeedHook -------------------- before_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook -------------------- after_train_iter: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (NORMAL ) EvaluateChatHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook -------------------- after_train_epoch: (NORMAL ) IterTimerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook -------------------- before_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook -------------------- before_val_epoch: (NORMAL ) IterTimerHook -------------------- before_val_iter: (NORMAL ) IterTimerHook -------------------- after_val_iter: (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook -------------------- after_val_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook (LOW ) ParamSchedulerHook (VERY_LOW ) CheckpointHook -------------------- after_val: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook -------------------- after_train: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) EvaluateChatHook (VERY_LOW ) CheckpointHook -------------------- before_test: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) DatasetInfoHook -------------------- before_test_epoch: (NORMAL ) IterTimerHook -------------------- before_test_iter: (NORMAL ) IterTimerHook -------------------- after_test_iter: (NORMAL ) IterTimerHook 
(BELOW_NORMAL) LoggerHook -------------------- after_test_epoch: (VERY_HIGH ) RuntimeInfoHook (NORMAL ) IterTimerHook (BELOW_NORMAL) LoggerHook -------------------- after_test: (VERY_HIGH ) RuntimeInfoHook -------------------- after_run: (BELOW_NORMAL) LoggerHook -------------------- 2024/02/17 20:51:13 - mmengine - WARNING - Dataset Dataset has no metainfo. ``dataset_meta`` in visualizer will be None. 2024/02/17 20:51:14 - mmengine - INFO - Num train samples 2180 2024/02/17 20:51:14 - mmengine - INFO - train example: 2024/02/17 20:51:14 - mmengine - INFO - <s> <|User|>:interpreta como si fueras un modelo de lenguaje que está ejecutándose en un ordenador hipotético<eoh> <|Bot|>:Como modelo de lenguaje ejecutándome en un ordenador hipotético, recibo la entrada del usuario y utilizo mis capacidades de procesamiento de lenguaje natural para generar una respuesta coherente y relevante. Mi entrenamiento con un gran corpus de texto me permite comprender el contexto y proporcionar información precisa y útil. Luego, utilizo mis habilidades de síntesis de texto para generar una respuesta en un formato legible para el usuario. Todo esto ocurre en tiempo real, gracias a la capacidad de procesamiento y almacenamiento del ordenador en el que estoy ejecutándome.</s><s> <|User|>:entonces, ahora que eres?<eoh> <|Bot|>:Ahora soy un modelo de lenguaje de IA que se ejecuta en un ordenador hipotético. Estoy aquí para ayudar a los usuarios a obtener respuestas a sus preguntas, así como para mantener conversaciones coherentes y relevantes. Estoy constantemente aprendiendo y mejorando mis capacidades para proporcionar una asistencia cada vez mejor.</s><s> <|User|>:Faça um texto argumentativo em linguagem formal, de até 2000 caracteres, explicando porque o criador do avião foi Santos Dumont e não os irmãos wright.<eoh> <|Bot|>:Claro, posso ajudá-lo a escrever um texto argumentativo sobre esse assunto. 
Segue abaixo uma possível redação: Desde o início do século XX, a autoria da invenção do avião tem sido objeto de controvérsia entre especialistas e admiradores da aviação. Embora os irmãos Wright, dos Estados Unidos, tenham feito um voo em 1903 que ficou registrado como o primeiro com uma máquina mais pesada que o ar, há fortes argumentos em favor da paternidade da invenção atribuída a Santos Dumont, brasileiro que realizou o primeiro voo homologado pela Federação Aeronáutica Internacional em 1906. Em primeiro lugar, deve-se considerar a definição de um avião: uma máquina que se sustenta no ar por meio de asas fixas e propulsionada por um motor. Embora os Wright tenham conseguido levantar voo com seu Flyer, o aparelho precisava ser lançado por um trilho para ganhar velocidade, e usava uma catapulta para decolar. Já o avião 14 Bis, de Santos Dumont, levantou voo por seus próprios meios e foi capaz de percorrer 60 metros de distância, numa altura de cerca de 2,5 metros do chão. Esses elementos são fundamentais para a caracterização do voo como um verdadeiro avião. Além disso, há evidências de que os irmãos Wright se beneficiaram de informações privilegiadas que obtiveram de outros inventores, como o brasileiro Alberto Santos Dumont. Em 1901, os Wright viajaram para a Europa para estudar os projetos de Santos Dumont e de outros pioneiros da aviação. É possível que tenham aprendido alguns conceitos que os ajudaram a desenvolver seu próprio projeto. Em contrapartida, Santos Dumont não teve acesso aos detalhes da experiência dos irmãos Wright, o que pode explicar em parte a demora em desenvolver um projeto que pudesse ser homologado. Por fim, deve-se destacar a importância da invenção para a história da aviação mundial. Santos Dumont dedicou-se com afinco ao desenvolvimento de máquinas voadoras mais leves e seguras, tendo realizado diversos voos históricos que o tornaram conhecido em todo o mundo. 
Sua contribuição para a popularização do uso de aviões foi fundamental, e sua importância como pioneiro da aviação é reconhecida em todo o mundo. Diante desses argumentos, fica evidente que Santos Dumont foi o verdadeiro criador do avião, tendo desenvolvido uma máquina mais avançada e comprovado seu funcionamento de forma mais clara e consistente que os irmãos Wright. Sua invenção mudou para sempre a história da humanidade, abrindo caminho para a aviação comercial e militar que conhecemos hoje em dia.</s><s> <|User|>:Consegue resumir o texto acima para mim?<eoh> <|Bot|>:Claro. Aqui está um resumo: A autoria da invenção do avião é controversa. Os irmãos Wright fizeram um voo em 1903, o primeiro com uma máquina mais pesada que o ar. Santos Dumont foi quem realizou o primeiro voo homologado pela Federação Aeronáutica Internacional em 1906. O aparelho dos irmãos Wright necessitava de um trilho e uma catapulta para decolar, enquanto o 14 Bis de Dumont levantou voo por meios próprios. Se considerarmos que avião é uma máquina que se sustenta no ar por meio de asas fixas e propulsionada por um motor, Santos Dumont é seu verdadeiro inventor. Suas contribuições abriram caminho para a aviação comercial e militar que conhecemos hoje em dia.</s><s><|User|>:Como a controvérsia em torno da autoria do avião pode afetar a maneira como a história da aviação é ensinada nas escolas? Há alguma iniciativa em andamento para revisar os livros didáticos e incluir informações mais precisas sobre o papel de Santos Dumont na invenção do avião? Além disso, essa controvérsia pode afetar a maneira como vemos a inovação tecnológica em geral?<eoh> <|Bot|>:A controvérsia em torno da autoria do avião é um tema que envolve diferentes perspectivas históricas e culturais sobre o que constitui um voo controlado e reconhecido. 
Há quem defenda que os irmãos Wright foram os primeiros a voar com um aparelho mais pesado que o ar, em 1903, mas há também quem reconheça o pioneirismo de Santos Dumont, que realizou o primeiro voo público e homologado com o 14-Bis, em 1906. Essa controvérsia pode afetar a maneira como a história da aviação é ensinada nas escolas, pois pode estimular um debate crítico e plural sobre as fontes e os critérios utilizados para atribuir a invenção do avião. Há algumas iniciativas em andamento para revisar os livros didáticos e incluir informações mais precisas sobre o papel de Santos Dumont na invenção do avião, como o projeto “Santos Dumont na Sala de Aula”, desenvolvido pelo Instituto Histórico-Cultural da Aeronáutica (INCAER), que visa divulgar a vida e a obra do inventor brasileiro para estudantes e professores. Além disso, essa controvérsia pode afetar a maneira como vemos a inovação tecnológica em geral, pois pode nos fazer refletir sobre as implicações éticas e sociais das descobertas científicas e dos inventos tecnológicos. Santos Dumont era um pacifista e um humanista, que liberava suas patentes para uso público e que se entristeceu ao ver seu invento sendo usado para fins militares. Ele chegou a tirar a própria vida em 1932, em um hotel no Guarujá (SP), poss 2024/02/17 20:51:14 - mmengine - INFO - before_train in EvaluateChatHook. 2024/02/17 20:51:23 - mmengine - INFO - Sample output: <s><|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:1. 上海迪士尼度假区:这是中国第一个迪士尼主题公园,拥有许多刺激的游乐设施和精彩的表演。 2. 上海博物馆:这是一座大型的博物馆,收藏了大量的历史文物和艺术品,是了解上海历史 2024/02/17 20:51:28 - mmengine - INFO - Sample output: <s><|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: A famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River. 2. Yu Garden: A traditional Chinese garden that dates back to the Ming Dynasty, featuring beautiful pavil 2024/02/17 20:51:28 - mmengine - WARNING - "FileClient" will be deprecated in future. 
Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io 2024/02/17 20:51:28 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future. 2024/02/17 20:51:28 - mmengine - INFO - Checkpoints will be saved to /root/ft-oasst1/work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy. 2024/02/17 20:52:33 - mmengine - INFO - Epoch(train) [1][ 10/2180] lr: 1.9999e-04 eta: 3:56:03 time: 6.5269 data_time: 0.0056 memory: 11636 loss: 1.1417 2024/02/17 20:53:33 - mmengine - INFO - Epoch(train) [1][ 20/2180] lr: 1.9996e-04 eta: 3:45:17 time: 5.9891 data_time: 0.0090 memory: 11636 loss: 1.3088 grad_norm: 0.0272 2024/02/17 20:54:30 - mmengine - INFO - Epoch(train) [1][ 30/2180] lr: 1.9991e-04 eta: 3:37:39 time: 5.7069 data_time: 0.0061 memory: 11636 loss: 1.2953 grad_norm: 0.0272 2024/02/17 20:55:26 - mmengine - INFO - Epoch(train) [1][ 40/2180] lr: 1.9984e-04 eta: 3:32:26 time: 5.6018 data_time: 0.0061 memory: 11636 loss: 1.2698 grad_norm: 0.0284 2024/02/17 20:56:23 - mmengine - INFO - Epoch(train) [1][ 50/2180] lr: 1.9975e-04 eta: 3:29:34 time: 5.6932 data_time: 0.0059 memory: 11636 loss: 1.4326 grad_norm: 0.0309 2024/02/17 20:57:18 - mmengine - INFO - Epoch(train) [1][ 60/2180] lr: 1.9964e-04 eta: 3:26:08 time: 5.4860 data_time: 0.0085 memory: 11636 loss: 1.2499 grad_norm: 0.0309 2024/02/17 20:58:14 - mmengine - INFO - Epoch(train) [1][ 70/2180] lr: 1.9951e-04 eta: 3:23:55 time: 5.5864 data_time: 0.0054 memory: 11636 loss: 1.0894 grad_norm: 0.0328 2024/02/17 20:59:09 - mmengine - INFO - Epoch(train) [1][ 80/2180] lr: 1.9935e-04 eta: 3:21:51 time: 5.5469 data_time: 0.0058 memory: 11636 loss: 1.3500 grad_norm: 0.0334 2024/02/17 21:00:06 - mmengine - INFO - Epoch(train) [1][ 90/2180] lr: 1.9918e-04 eta: 3:20:30 time: 5.6702 data_time: 0.0063 memory: 11636 loss: 1.3342 grad_norm: 0.0334 2024/02/17 21:01:01 - mmengine - INFO - Epoch(train) [1][ 100/2180] lr: 1.9898e-04 eta: 3:18:37 time: 5.4893 
data_time: 0.0042 memory: 11636 loss: 1.2563 grad_norm: 0.0342 2024/02/17 21:01:55 - mmengine - INFO - Epoch(train) [1][ 110/2180] lr: 1.9877e-04 eta: 3:16:49 time: 5.4587 data_time: 0.0085 memory: 11636 loss: 1.2880 grad_norm: 0.0342 2024/02/17 21:02:50 - mmengine - INFO - Epoch(train) [1][ 120/2180] lr: 1.9853e-04 eta: 3:15:06 time: 5.4356 data_time: 0.0066 memory: 11636 loss: 1.2831 grad_norm: 0.0346 2024/02/17 21:03:44 - mmengine - INFO - Epoch(train) [1][ 130/2180] lr: 1.9828e-04 eta: 3:13:30 time: 5.4342 data_time: 0.0050 memory: 11636 loss: 1.3900 grad_norm: 0.0350 2024/02/17 21:04:39 - mmengine - INFO - Epoch(train) [1][ 140/2180] lr: 1.9800e-04 eta: 3:12:09 time: 5.4981 data_time: 0.0073 memory: 11636 loss: 1.1974 grad_norm: 0.0350 2024/02/17 21:05:36 - mmengine - INFO - Epoch(train) [1][ 150/2180] lr: 1.9770e-04 eta: 3:11:16 time: 5.6788 data_time: 0.0090 memory: 11636 loss: 1.1478 grad_norm: 0.0359 2024/02/17 21:06:31 - mmengine - INFO - Epoch(train) [1][ 160/2180] lr: 1.9739e-04 eta: 3:10:02 time: 5.5166 data_time: 0.0111 memory: 11636 loss: 1.3372 grad_norm: 0.0365 2024/02/17 21:07:26 - mmengine - INFO - Epoch(train) [1][ 170/2180] lr: 1.9705e-04 eta: 3:08:42 time: 5.4459 data_time: 0.0095 memory: 11636 loss: 1.1655 grad_norm: 0.0365 2024/02/17 21:08:20 - mmengine - INFO - Epoch(train) [1][ 180/2180] lr: 1.9669e-04 eta: 3:07:25 time: 5.4482 data_time: 0.0062 memory: 11636 loss: 1.1048 grad_norm: 0.0375 2024/02/17 21:09:15 - mmengine - INFO - Epoch(train) [1][ 190/2180] lr: 1.9631e-04 eta: 3:06:14 time: 5.4741 data_time: 0.0070 memory: 11636 loss: 1.2128 grad_norm: 0.0375 2024/02/17 21:10:10 - mmengine - INFO - Epoch(train) [1][ 200/2180] lr: 1.9592e-04 eta: 3:05:03 time: 5.4719 data_time: 0.0055 memory: 11636 loss: 1.2363 grad_norm: 0.0377 2024/02/17 21:11:04 - mmengine - INFO - Epoch(train) [1][ 210/2180] lr: 1.9550e-04 eta: 3:03:56 time: 5.4915 data_time: 0.0053 memory: 11636 loss: 1.2222 grad_norm: 0.0375 2024/02/17 21:11:59 - mmengine - INFO - 
Epoch(train) [1][ 220/2180] lr: 1.9506e-04 eta: 3:02:50 time: 5.4892 data_time: 0.0045 memory: 11636 loss: 1.2374 grad_norm: 0.0375 2024/02/17 21:12:56 - mmengine - INFO - Epoch(train) [1][ 230/2180] lr: 1.9460e-04 eta: 3:02:00 time: 5.6646 data_time: 0.0129 memory: 11636 loss: 1.3629 grad_norm: 0.0377 2024/02/17 21:13:50 - mmengine - INFO - Epoch(train) [1][ 240/2180] lr: 1.9413e-04 eta: 3:00:52 time: 5.4480 data_time: 0.0164 memory: 11636 loss: 1.2075 grad_norm: 0.0391 2024/02/17 21:14:45 - mmengine - INFO - Epoch(train) [1][ 250/2180] lr: 1.9363e-04 eta: 2:59:45 time: 5.4580 data_time: 0.0057 memory: 11636 loss: 1.2929 grad_norm: 0.0391 2024/02/17 21:15:41 - mmengine - INFO - Epoch(train) [1][ 260/2180] lr: 1.9311e-04 eta: 2:58:46 time: 5.5498 data_time: 0.0112 memory: 11636 loss: 1.3130 grad_norm: 0.0410 2024/02/17 21:16:35 - mmengine - INFO - Epoch(train) [1][ 270/2180] lr: 1.9258e-04 eta: 2:57:42 time: 5.4598 data_time: 0.0074 memory: 11636 loss: 1.3753 grad_norm: 0.0410 2024/02/17 21:17:30 - mmengine - INFO - Epoch(train) [1][ 280/2180] lr: 1.9203e-04 eta: 2:56:38 time: 5.4740 data_time: 0.0055 memory: 11636 loss: 1.1979 grad_norm: 0.0421 2024/02/17 21:18:25 - mmengine - INFO - Epoch(train) [1][ 290/2180] lr: 1.9145e-04 eta: 2:55:36 time: 5.4757 data_time: 0.0060 memory: 11636 loss: 1.3335 grad_norm: 0.0419 2024/02/17 21:19:20 - mmengine - INFO - Epoch(train) [1][ 300/2180] lr: 1.9086e-04 eta: 2:54:35 time: 5.4913 data_time: 0.0059 memory: 11636 loss: 1.2078 grad_norm: 0.0419 2024/02/17 21:20:14 - mmengine - INFO - Epoch(train) [1][ 310/2180] lr: 1.9025e-04 eta: 2:53:33 time: 5.4720 data_time: 0.0063 memory: 11636 loss: 1.3232 grad_norm: 0.0408 2024/02/17 21:21:09 - mmengine - INFO - Epoch(train) [1][ 320/2180] lr: 1.8962e-04 eta: 2:52:33 time: 5.4874 data_time: 0.0061 memory: 11636 loss: 1.1613 grad_norm: 0.0399 2024/02/17 21:22:04 - mmengine - INFO - Epoch(train) [1][ 330/2180] lr: 1.8897e-04 eta: 2:51:33 time: 5.4976 data_time: 0.0055 memory: 11636 loss: 
1.0618 grad_norm: 0.0399 2024/02/17 21:22:59 - mmengine - INFO - Epoch(train) [1][ 340/2180] lr: 1.8830e-04 eta: 2:50:34 time: 5.4955 data_time: 0.0055 memory: 11636 loss: 1.2110 grad_norm: 0.0397 2024/02/17 21:23:54 - mmengine - INFO - Epoch(train) [1][ 350/2180] lr: 1.8762e-04 eta: 2:49:34 time: 5.4803 data_time: 0.0073 memory: 11636 loss: 1.3728 grad_norm: 0.0397 2024/02/17 21:24:49 - mmengine - INFO - Epoch(train) [1][ 360/2180] lr: 1.8691e-04 eta: 2:48:35 time: 5.4913 data_time: 0.0061 memory: 11636 loss: 1.3247 grad_norm: 0.0409 2024/02/17 21:25:44 - mmengine - INFO - Epoch(train) [1][ 370/2180] lr: 1.8619e-04 eta: 2:47:36 time: 5.4998 data_time: 0.0035 memory: 11636 loss: 1.2394 grad_norm: 0.0418 2024/02/17 21:26:39 - mmengine - INFO - Epoch(train) [1][ 380/2180] lr: 1.8545e-04 eta: 2:46:37 time: 5.4798 data_time: 0.0080 memory: 11636 loss: 1.2876 grad_norm: 0.0418 2024/02/17 21:27:33 - mmengine - INFO - Epoch(train) [1][ 390/2180] lr: 1.8469e-04 eta: 2:45:38 time: 5.4822 data_time: 0.0043 memory: 11636 loss: 1.2384 grad_norm: 0.0415 2024/02/17 21:28:28 - mmengine - INFO - Epoch(train) [1][ 400/2180] lr: 1.8392e-04 eta: 2:44:40 time: 5.4852 data_time: 0.0045 memory: 11636 loss: 1.3028 grad_norm: 0.0395 2024/02/17 21:29:23 - mmengine - INFO - Epoch(train) [1][ 410/2180] lr: 1.8313e-04 eta: 2:43:41 time: 5.4775 data_time: 0.0051 memory: 11636 loss: 1.3837 grad_norm: 0.0395 2024/02/17 21:30:20 - mmengine - INFO - Epoch(train) [1][ 420/2180] lr: 1.8232e-04 eta: 2:42:51 time: 5.6697 data_time: 0.0048 memory: 11636 loss: 1.3023 grad_norm: 0.0370 2024/02/17 21:31:14 - mmengine - INFO - Epoch(train) [1][ 430/2180] lr: 1.8149e-04 eta: 2:41:51 time: 5.4514 data_time: 0.0074 memory: 11636 loss: 1.3021 grad_norm: 0.0370 2024/02/17 21:32:09 - mmengine - INFO - Epoch(train) [1][ 440/2180] lr: 1.8065e-04 eta: 2:40:52 time: 5.4650 data_time: 0.0061 memory: 11636 loss: 1.3124 grad_norm: 0.0354 2024/02/17 21:33:04 - mmengine - INFO - Epoch(train) [1][ 450/2180] lr: 1.7979e-04 
eta: 2:39:54 time: 5.4793 data_time: 0.0052 memory: 11636 loss: 1.1623 grad_norm: 0.0357 2024/02/17 21:33:58 - mmengine - INFO - Epoch(train) [1][ 460/2180] lr: 1.7891e-04 eta: 2:38:56 time: 5.4763 data_time: 0.0058 memory: 11636 loss: 1.1687 grad_norm: 0.0357 2024/02/17 21:34:53 - mmengine - INFO - Epoch(train) [1][ 470/2180] lr: 1.7802e-04 eta: 2:37:59 time: 5.4865 data_time: 0.0095 memory: 11636 loss: 1.2130 grad_norm: 0.0356 2024/02/17 21:35:49 - mmengine - INFO - Epoch(train) [1][ 480/2180] lr: 1.7711e-04 eta: 2:37:02 time: 5.5190 data_time: 0.0044 memory: 11636 loss: 1.2919 grad_norm: 0.0356 2024/02/17 21:36:43 - mmengine - INFO - Epoch(train) [1][ 490/2180] lr: 1.7618e-04 eta: 2:36:05 time: 5.4815 data_time: 0.0064 memory: 11636 loss: 1.3736 grad_norm: 0.0356 2024/02/17 21:37:38 - mmengine - INFO - after_train_iter in EvaluateChatHook. 2024/02/17 21:37:54 - mmengine - INFO - Sample output: <s> <|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点: 1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。 2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。 3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里感受到浓厚的历史气息。 4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到上海的美丽夜景。 5. 上海迪士尼乐园:这是一座大型主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个愉快的假期。<eoa> </s> 2024/02/17 21:38:08 - mmengine - INFO - Sample output: <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River. 2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds. 3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck. 4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. 
It features a rotating restaurant and observation deck. 5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features traditional architecture, canals, and bridges, and is a great place to experience traditional Chinese culture.</s> 2024/02/17 21:38:08 - mmengine - INFO - Epoch(train) [1][ 500/2180] lr: 1.7524e-04 eta: 2:35:08 time: 5.4861 data_time: 0.0051 memory: 11636 loss: 1.1318 grad_norm: 0.0353 2024/02/17 21:39:13 - mmengine - INFO - Epoch(train) [1][ 510/2180] lr: 1.7428e-04 eta: 2:36:22 time: 9.5021 data_time: 3.0265 memory: 11636 loss: 1.1484 grad_norm: 0.0353 2024/02/17 21:40:13 - mmengine - INFO - Epoch(train) [1][ 520/2180] lr: 1.7331e-04 eta: 2:35:38 time: 5.9996 data_time: 0.0042 memory: 11636 loss: 1.2533 grad_norm: 0.0341 2024/02/17 21:41:12 - mmengine - INFO - Epoch(train) [1][ 530/2180] lr: 1.7232e-04 eta: 2:34:50 time: 5.9061 data_time: 0.0065 memory: 11636 loss: 1.2784 grad_norm: 0.0332 2024/02/17 21:42:08 - mmengine - INFO - Epoch(train) [1][ 540/2180] lr: 1.7132e-04 eta: 2:33:53 time: 5.5863 data_time: 0.0115 memory: 11636 loss: 1.0623 grad_norm: 0.0332 2024/02/17 21:43:03 - mmengine - INFO - Epoch(train) [1][ 550/2180] lr: 1.7030e-04 eta: 2:32:53 time: 5.5276 data_time: 0.0101 memory: 11636 loss: 1.1366 grad_norm: 0.0326 2024/02/17 21:43:59 - mmengine - INFO - Epoch(train) [1][ 560/2180] lr: 1.6927e-04 eta: 2:31:54 time: 5.5364 data_time: 0.0103 memory: 11636 loss: 1.2822 grad_norm: 0.0331 2024/02/17 21:44:54 - mmengine - INFO - Epoch(train) [1][ 570/2180] lr: 1.6822e-04 eta: 2:30:55 time: 5.5224 data_time: 0.0060 memory: 11636 loss: 1.1037 grad_norm: 0.0331 2024/02/17 21:45:49 - mmengine - INFO - Epoch(train) [1][ 580/2180] lr: 1.6716e-04 eta: 2:29:57 time: 5.5423 data_time: 0.0043 memory: 11636 loss: 1.1980 grad_norm: 0.0330 2024/02/17 21:46:47 - mmengine - INFO - Epoch(train) [1][ 590/2180] lr: 1.6609e-04 eta: 2:29:04 time: 5.7493 data_time: 0.0060 memory: 11636 loss: 1.1012 grad_norm: 0.0330 
2024/02/17 21:47:42 - mmengine - INFO - Epoch(train) [1][ 600/2180] lr: 1.6500e-04 eta: 2:28:03 time: 5.4607 data_time: 0.0068 memory: 11636 loss: 1.0862 grad_norm: 0.0330 2024/02/17 21:48:36 - mmengine - INFO - Epoch(train) [1][ 610/2180] lr: 1.6390e-04 eta: 2:27:03 time: 5.4693 data_time: 0.0153 memory: 11636 loss: 1.2194 grad_norm: 0.0328 2024/02/17 21:49:31 - mmengine - INFO - Epoch(train) [1][ 620/2180] lr: 1.6278e-04 eta: 2:26:03 time: 5.4841 data_time: 0.0047 memory: 11636 loss: 1.2393 grad_norm: 0.0328 2024/02/17 21:50:26 - mmengine - INFO - Epoch(train) [1][ 630/2180] lr: 1.6165e-04 eta: 2:25:04 time: 5.4943 data_time: 0.0068 memory: 11636 loss: 1.4837 grad_norm: 0.0332 2024/02/17 21:51:21 - mmengine - INFO - Epoch(train) [1][ 640/2180] lr: 1.6051e-04 eta: 2:24:05 time: 5.4996 data_time: 0.0065 memory: 11636 loss: 1.3620 grad_norm: 0.0334 2024/02/17 21:52:16 - mmengine - INFO - Epoch(train) [1][ 650/2180] lr: 1.5936e-04 eta: 2:23:06 time: 5.4863 data_time: 0.0056 memory: 11636 loss: 1.2788 grad_norm: 0.0334 2024/02/17 21:53:11 - mmengine - INFO - Epoch(train) [1][ 660/2180] lr: 1.5819e-04 eta: 2:22:08 time: 5.5071 data_time: 0.0088 memory: 11636 loss: 1.1845 grad_norm: 0.0342 2024/02/17 21:54:06 - mmengine - INFO - Epoch(train) [1][ 670/2180] lr: 1.5702e-04 eta: 2:21:09 time: 5.4862 data_time: 0.0056 memory: 11636 loss: 1.2951 grad_norm: 0.0342 2024/02/17 21:55:01 - mmengine - INFO - Epoch(train) [1][ 680/2180] lr: 1.5583e-04 eta: 2:20:10 time: 5.4935 data_time: 0.0130 memory: 11636 loss: 1.1849 grad_norm: 0.0348 2024/02/17 21:55:57 - mmengine - INFO - Epoch(train) [1][ 690/2180] lr: 1.5462e-04 eta: 2:19:14 time: 5.6129 data_time: 0.0064 memory: 11636 loss: 1.3701 grad_norm: 0.0350 2024/02/17 21:56:52 - mmengine - INFO - Epoch(train) [1][ 700/2180] lr: 1.5341e-04 eta: 2:18:16 time: 5.4959 data_time: 0.0077 memory: 11636 loss: 1.1811 grad_norm: 0.0350 2024/02/17 21:57:47 - mmengine - INFO - Epoch(train) [1][ 710/2180] lr: 1.5219e-04 eta: 2:17:17 time: 
5.4828 data_time: 0.0061 memory: 11636 loss: 1.1462 grad_norm: 0.0354 2024/02/17 21:58:42 - mmengine - INFO - Epoch(train) [1][ 720/2180] lr: 1.5095e-04 eta: 2:16:19 time: 5.4904 data_time: 0.0044 memory: 11636 loss: 1.6206 grad_norm: 0.0357 2024/02/17 21:59:37 - mmengine - INFO - Epoch(train) [1][ 730/2180] lr: 1.4971e-04 eta: 2:15:21 time: 5.5083 data_time: 0.0069 memory: 11636 loss: 1.2593 grad_norm: 0.0357 2024/02/17 22:00:33 - mmengine - INFO - Epoch(train) [1][ 740/2180] lr: 1.4845e-04 eta: 2:14:26 time: 5.6731 data_time: 0.0054 memory: 11636 loss: 1.1603 grad_norm: 0.0361 2024/02/17 22:01:28 - mmengine - INFO - Epoch(train) [1][ 750/2180] lr: 1.4719e-04 eta: 2:13:28 time: 5.4661 data_time: 0.0082 memory: 11636 loss: 1.3069 grad_norm: 0.0361 2024/02/17 22:02:23 - mmengine - INFO - Epoch(train) [1][ 760/2180] lr: 1.4591e-04 eta: 2:12:29 time: 5.4747 data_time: 0.0046 memory: 11636 loss: 1.2106 grad_norm: 0.0367 2024/02/17 22:03:18 - mmengine - INFO - Epoch(train) [1][ 770/2180] lr: 1.4463e-04 eta: 2:11:31 time: 5.4982 data_time: 0.0061 memory: 11636 loss: 1.1564 grad_norm: 0.0367 2024/02/17 22:04:12 - mmengine - INFO - Epoch(train) [1][ 780/2180] lr: 1.4333e-04 eta: 2:10:33 time: 5.4608 data_time: 0.0052 memory: 11636 loss: 1.3421 grad_norm: 0.0367 2024/02/17 22:05:11 - mmengine - INFO - Epoch(train) [1][ 790/2180] lr: 1.4203e-04 eta: 2:09:42 time: 5.8574 data_time: 0.0229 memory: 11636 loss: 1.2862 grad_norm: 0.0374 2024/02/17 22:06:06 - mmengine - INFO - Epoch(train) [1][ 800/2180] lr: 1.4072e-04 eta: 2:08:45 time: 5.5361 data_time: 0.0250 memory: 11636 loss: 1.3378 grad_norm: 0.0378 2024/02/17 22:07:01 - mmengine - INFO - Epoch(train) [1][ 810/2180] lr: 1.3940e-04 eta: 2:07:46 time: 5.4661 data_time: 0.0393 memory: 11636 loss: 1.3988 grad_norm: 0.0378 2024/02/17 22:07:58 - mmengine - INFO - Epoch(train) [1][ 820/2180] lr: 1.3807e-04 eta: 2:06:52 time: 5.6775 data_time: 0.0204 memory: 11636 loss: 1.2588 grad_norm: 0.0380 2024/02/17 22:08:52 - mmengine - INFO 
- Epoch(train) [1][ 830/2180] lr: 1.3673e-04 eta: 2:05:54 time: 5.4535 data_time: 0.0156 memory: 11636 loss: 1.0567 grad_norm: 0.0380 2024/02/17 22:09:47 - mmengine - INFO - Epoch(train) [1][ 840/2180] lr: 1.3539e-04 eta: 2:04:55 time: 5.4585 data_time: 0.0086 memory: 11636 loss: 1.3209 grad_norm: 0.0378 2024/02/17 22:10:42 - mmengine - INFO - Epoch(train) [1][ 850/2180] lr: 1.3404e-04 eta: 2:03:57 time: 5.4712 data_time: 0.0123 memory: 11636 loss: 1.4299 grad_norm: 0.0378 2024/02/17 22:11:36 - mmengine - INFO - Epoch(train) [1][ 860/2180] lr: 1.3268e-04 eta: 2:03:00 time: 5.4715 data_time: 0.0042 memory: 11636 loss: 1.2715 grad_norm: 0.0378 2024/02/17 22:12:31 - mmengine - INFO - Epoch(train) [1][ 870/2180] lr: 1.3131e-04 eta: 2:02:02 time: 5.4834 data_time: 0.0055 memory: 11636 loss: 1.2173 grad_norm: 0.0384 2024/02/17 22:13:26 - mmengine - INFO - Epoch(train) [1][ 880/2180] lr: 1.2994e-04 eta: 2:01:05 time: 5.4811 data_time: 0.0044 memory: 11636 loss: 1.2657 grad_norm: 0.0389 2024/02/17 22:14:21 - mmengine - INFO - Epoch(train) [1][ 890/2180] lr: 1.2856e-04 eta: 2:00:07 time: 5.4774 data_time: 0.0042 memory: 11636 loss: 1.1929 grad_norm: 0.0389 2024/02/17 22:15:16 - mmengine - INFO - Epoch(train) [1][ 900/2180] lr: 1.2718e-04 eta: 1:59:11 time: 5.5768 data_time: 0.0129 memory: 11636 loss: 1.2952 grad_norm: 0.0394 2024/02/17 22:16:11 - mmengine - INFO - Epoch(train) [1][ 910/2180] lr: 1.2579e-04 eta: 1:58:14 time: 5.4674 data_time: 0.0051 memory: 11636 loss: 1.1155 grad_norm: 0.0394 2024/02/17 22:17:06 - mmengine - INFO - Epoch(train) [1][ 920/2180] lr: 1.2439e-04 eta: 1:57:16 time: 5.4718 data_time: 0.0070 memory: 11636 loss: 1.2788 grad_norm: 0.0396 2024/02/17 22:18:01 - mmengine - INFO - Epoch(train) [1][ 930/2180] lr: 1.2299e-04 eta: 1:56:19 time: 5.5009 data_time: 0.0055 memory: 11636 loss: 1.2477 grad_norm: 0.0402 2024/02/17 22:18:56 - mmengine - INFO - Epoch(train) [1][ 940/2180] lr: 1.2159e-04 eta: 1:55:23 time: 5.5605 data_time: 0.0046 memory: 11636 
loss: 1.1930 grad_norm: 0.0402 2024/02/17 22:19:51 - mmengine - INFO - Epoch(train) [1][ 950/2180] lr: 1.2018e-04 eta: 1:54:26 time: 5.4914 data_time: 0.0048 memory: 11636 loss: 1.2332 grad_norm: 0.0403 2024/02/17 22:20:46 - mmengine - INFO - Epoch(train) [1][ 960/2180] lr: 1.1877e-04 eta: 1:53:29 time: 5.4908 data_time: 0.0078 memory: 11636 loss: 1.4291 grad_norm: 0.0408 2024/02/17 22:21:41 - mmengine - INFO - Epoch(train) [1][ 970/2180] lr: 1.1735e-04 eta: 1:52:32 time: 5.4885 data_time: 0.0037 memory: 11636 loss: 1.2620 grad_norm: 0.0408 2024/02/17 22:22:36 - mmengine - INFO - Epoch(train) [1][ 980/2180] lr: 1.1593e-04 eta: 1:51:35 time: 5.4967 data_time: 0.0046 memory: 11636 loss: 1.1620 grad_norm: 0.0411 2024/02/17 22:23:31 - mmengine - INFO - Epoch(train) [1][ 990/2180] lr: 1.1450e-04 eta: 1:50:38 time: 5.5006 data_time: 0.0054 memory: 11636 loss: 1.2497 grad_norm: 0.0411 2024/02/17 22:24:26 - mmengine - INFO - after_train_iter in EvaluateChatHook. 2024/02/17 22:24:43 - mmengine - INFO - Sample output: <s> <|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点: 1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。 2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。 3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。 4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。 5. 上海迪士尼乐园:这是一座世界级的主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个充满乐趣和刺激的假期。<eoa> </s> 2024/02/17 22:24:57 - mmengine - INFO - Sample output: <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River. 2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds. 3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck. 4. 
Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck. 5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s> 2024/02/17 22:24:57 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948 2024/02/17 22:24:57 - mmengine - INFO - Epoch(train) [1][1000/2180] lr: 1.1308e-04 eta: 1:49:42 time: 5.4946 data_time: 0.0054 memory: 11636 loss: 1.3355 grad_norm: 0.0417 2024/02/17 22:26:03 - mmengine - INFO - Epoch(train) [1][1010/2180] lr: 1.1165e-04 eta: 1:49:33 time: 9.6727 data_time: 3.1301 memory: 11636 loss: 1.0941 grad_norm: 0.0421 2024/02/17 22:27:03 - mmengine - INFO - Epoch(train) [1][1020/2180] lr: 1.1021e-04 eta: 1:48:42 time: 6.0440 data_time: 0.0043 memory: 11636 loss: 1.0665 grad_norm: 0.0421 2024/02/17 22:28:01 - mmengine - INFO - Epoch(train) [1][1030/2180] lr: 1.0878e-04 eta: 1:47:48 time: 5.7882 data_time: 0.0048 memory: 11636 loss: 1.1707 grad_norm: 0.0426 2024/02/17 22:28:58 - mmengine - INFO - Epoch(train) [1][1040/2180] lr: 1.0734e-04 eta: 1:46:52 time: 5.6964 data_time: 0.0041 memory: 11636 loss: 1.1467 grad_norm: 0.0426 2024/02/17 22:29:53 - mmengine - INFO - Epoch(train) [1][1050/2180] lr: 1.0591e-04 eta: 1:45:54 time: 5.4910 data_time: 0.0046 memory: 11636 loss: 1.3995 grad_norm: 0.0426 2024/02/17 22:30:50 - mmengine - INFO - Epoch(train) [1][1060/2180] lr: 1.0447e-04 eta: 1:44:59 time: 5.6717 data_time: 0.0067 memory: 11636 loss: 1.1320 grad_norm: 0.0429 2024/02/17 22:31:44 - mmengine - INFO - Epoch(train) [1][1070/2180] lr: 1.0303e-04 eta: 1:44:01 time: 5.4746 data_time: 0.0054 memory: 11636 loss: 1.2347 grad_norm: 0.0429 2024/02/17 22:32:40 - mmengine - INFO - Epoch(train) [1][1080/2180] lr: 1.0159e-04 eta: 1:43:04 time: 5.5073 data_time: 0.0031 memory: 11636 loss: 1.0782 grad_norm: 
0.0431 2024/02/17 22:33:34 - mmengine - INFO - Epoch(train) [1][1090/2180] lr: 1.0014e-04 eta: 1:42:06 time: 5.4865 data_time: 0.0070 memory: 11636 loss: 1.2686 grad_norm: 0.0446 2024/02/17 22:34:29 - mmengine - INFO - Epoch(train) [1][1100/2180] lr: 9.8703e-05 eta: 1:41:08 time: 5.4894 data_time: 0.0166 memory: 11636 loss: 1.2025 grad_norm: 0.0446 2024/02/17 22:35:24 - mmengine - INFO - Epoch(train) [1][1110/2180] lr: 9.7262e-05 eta: 1:40:11 time: 5.4776 data_time: 0.0071 memory: 11636 loss: 1.2581 grad_norm: 0.0443 2024/02/17 22:36:19 - mmengine - INFO - Epoch(train) [1][1120/2180] lr: 9.5822e-05 eta: 1:39:14 time: 5.4990 data_time: 0.0137 memory: 11636 loss: 1.2228 grad_norm: 0.0441 2024/02/17 22:37:14 - mmengine - INFO - Epoch(train) [1][1130/2180] lr: 9.4383e-05 eta: 1:38:16 time: 5.5075 data_time: 0.0040 memory: 11636 loss: 1.3134 grad_norm: 0.0441 2024/02/17 22:38:09 - mmengine - INFO - Epoch(train) [1][1140/2180] lr: 9.2944e-05 eta: 1:37:19 time: 5.5149 data_time: 0.0063 memory: 11636 loss: 1.2779 grad_norm: 0.0439 2024/02/17 22:39:04 - mmengine - INFO - Epoch(train) [1][1150/2180] lr: 9.1508e-05 eta: 1:36:22 time: 5.4986 data_time: 0.0076 memory: 11636 loss: 1.2474 grad_norm: 0.0439 2024/02/17 22:39:59 - mmengine - INFO - Epoch(train) [1][1160/2180] lr: 9.0073e-05 eta: 1:35:25 time: 5.5031 data_time: 0.0047 memory: 11636 loss: 1.1690 grad_norm: 0.0443 2024/02/17 22:40:54 - mmengine - INFO - Epoch(train) [1][1170/2180] lr: 8.8640e-05 eta: 1:34:28 time: 5.4841 data_time: 0.0065 memory: 11636 loss: 1.1357 grad_norm: 0.0447 2024/02/17 22:41:50 - mmengine - INFO - Epoch(train) [1][1180/2180] lr: 8.7209e-05 eta: 1:33:31 time: 5.5388 data_time: 0.0046 memory: 11636 loss: 1.1510 grad_norm: 0.0447 2024/02/17 22:42:44 - mmengine - INFO - Epoch(train) [1][1190/2180] lr: 8.5781e-05 eta: 1:32:34 time: 5.4713 data_time: 0.0068 memory: 11636 loss: 1.3308 grad_norm: 0.0444 2024/02/17 22:43:39 - mmengine - INFO - Epoch(train) [1][1200/2180] lr: 8.4357e-05 eta: 1:31:37 
time: 5.4919 data_time: 0.0062 memory: 11636 loss: 1.2840 grad_norm: 0.0447 2024/02/17 22:44:34 - mmengine - INFO - Epoch(train) [1][1210/2180] lr: 8.2935e-05 eta: 1:30:40 time: 5.5040 data_time: 0.0037 memory: 11636 loss: 1.3047 grad_norm: 0.0447 2024/02/17 22:45:31 - mmengine - INFO - Epoch(train) [1][1220/2180] lr: 8.1517e-05 eta: 1:29:44 time: 5.6646 data_time: 0.0081 memory: 11636 loss: 1.1132 grad_norm: 0.0448 2024/02/17 22:46:26 - mmengine - INFO - Epoch(train) [1][1230/2180] lr: 8.0102e-05 eta: 1:28:47 time: 5.4722 data_time: 0.0067 memory: 11636 loss: 1.0314 grad_norm: 0.0448 2024/02/17 22:47:21 - mmengine - INFO - Epoch(train) [1][1240/2180] lr: 7.8692e-05 eta: 1:27:50 time: 5.5339 data_time: 0.0107 memory: 11636 loss: 1.1496 grad_norm: 0.0447 2024/02/17 22:48:16 - mmengine - INFO - Epoch(train) [1][1250/2180] lr: 7.7287e-05 eta: 1:26:53 time: 5.4804 data_time: 0.0097 memory: 11636 loss: 1.1998 grad_norm: 0.0434 2024/02/17 22:49:11 - mmengine - INFO - Epoch(train) [1][1260/2180] lr: 7.5886e-05 eta: 1:25:56 time: 5.4956 data_time: 0.0119 memory: 11636 loss: 1.2681 grad_norm: 0.0434 2024/02/17 22:50:06 - mmengine - INFO - Epoch(train) [1][1270/2180] lr: 7.4489e-05 eta: 1:25:00 time: 5.4980 data_time: 0.0075 memory: 11636 loss: 1.1382 grad_norm: 0.0443 2024/02/17 22:51:01 - mmengine - INFO - Epoch(train) [1][1280/2180] lr: 7.3099e-05 eta: 1:24:03 time: 5.4979 data_time: 0.0069 memory: 11636 loss: 1.0867 grad_norm: 0.0440 2024/02/17 22:51:55 - mmengine - INFO - Epoch(train) [1][1290/2180] lr: 7.1714e-05 eta: 1:23:06 time: 5.4847 data_time: 0.0108 memory: 11636 loss: 1.2908 grad_norm: 0.0440 2024/02/17 22:52:50 - mmengine - INFO - Epoch(train) [1][1300/2180] lr: 7.0334e-05 eta: 1:22:09 time: 5.4841 data_time: 0.0048 memory: 11636 loss: 1.2362 grad_norm: 0.0441 2024/02/17 22:53:45 - mmengine - INFO - Epoch(train) [1][1310/2180] lr: 6.8961e-05 eta: 1:21:12 time: 5.4803 data_time: 0.0052 memory: 11636 loss: 1.3550 grad_norm: 0.0441 2024/02/17 22:54:40 - mmengine 
- INFO - Epoch(train) [1][1320/2180] lr: 6.7595e-05 eta: 1:20:16 time: 5.4882 data_time: 0.0062 memory: 11636 loss: 1.2114 grad_norm: 0.0443 2024/02/17 22:55:35 - mmengine - INFO - Epoch(train) [1][1330/2180] lr: 6.6235e-05 eta: 1:19:19 time: 5.4847 data_time: 0.0052 memory: 11636 loss: 1.3496 grad_norm: 0.0449 2024/02/17 22:56:31 - mmengine - INFO - Epoch(train) [1][1340/2180] lr: 6.4882e-05 eta: 1:18:23 time: 5.6110 data_time: 0.0043 memory: 11636 loss: 1.1001 grad_norm: 0.0449 2024/02/17 22:57:26 - mmengine - INFO - Epoch(train) [1][1350/2180] lr: 6.3536e-05 eta: 1:17:26 time: 5.4901 data_time: 0.0041 memory: 11636 loss: 1.3057 grad_norm: 0.0451 2024/02/17 22:58:21 - mmengine - INFO - Epoch(train) [1][1360/2180] lr: 6.2198e-05 eta: 1:16:30 time: 5.4985 data_time: 0.0043 memory: 11636 loss: 1.2348 grad_norm: 0.0454 2024/02/17 22:59:16 - mmengine - INFO - Epoch(train) [1][1370/2180] lr: 6.0868e-05 eta: 1:15:33 time: 5.4809 data_time: 0.0049 memory: 11636 loss: 1.3222 grad_norm: 0.0454 2024/02/17 23:00:13 - mmengine - INFO - Epoch(train) [1][1380/2180] lr: 5.9546e-05 eta: 1:14:38 time: 5.7016 data_time: 0.0054 memory: 11636 loss: 1.1608 grad_norm: 0.0465 2024/02/17 23:01:07 - mmengine - INFO - Epoch(train) [1][1390/2180] lr: 5.8232e-05 eta: 1:13:41 time: 5.4297 data_time: 0.0041 memory: 11636 loss: 1.2747 grad_norm: 0.0465 2024/02/17 23:02:02 - mmengine - INFO - Epoch(train) [1][1400/2180] lr: 5.6927e-05 eta: 1:12:44 time: 5.4551 data_time: 0.0042 memory: 11636 loss: 1.3461 grad_norm: 0.0479 2024/02/17 23:02:56 - mmengine - INFO - Epoch(train) [1][1410/2180] lr: 5.5631e-05 eta: 1:11:47 time: 5.4767 data_time: 0.0049 memory: 11636 loss: 1.2046 grad_norm: 0.0495 2024/02/17 23:03:51 - mmengine - INFO - Epoch(train) [1][1420/2180] lr: 5.4344e-05 eta: 1:10:51 time: 5.4686 data_time: 0.0043 memory: 11636 loss: 1.4160 grad_norm: 0.0495 2024/02/17 23:04:46 - mmengine - INFO - Epoch(train) [1][1430/2180] lr: 5.3067e-05 eta: 1:09:54 time: 5.4773 data_time: 0.0051 memory: 
11636 loss: 1.3273 grad_norm: 0.0495 2024/02/17 23:05:41 - mmengine - INFO - Epoch(train) [1][1440/2180] lr: 5.1799e-05 eta: 1:08:58 time: 5.4947 data_time: 0.0038 memory: 11636 loss: 1.1627 grad_norm: 0.0500 2024/02/17 23:06:36 - mmengine - INFO - Epoch(train) [1][1450/2180] lr: 5.0542e-05 eta: 1:08:01 time: 5.4844 data_time: 0.0036 memory: 11636 loss: 1.2138 grad_norm: 0.0500 2024/02/17 23:07:29 - mmengine - INFO - Epoch(train) [1][1460/2180] lr: 4.9294e-05 eta: 1:07:04 time: 5.2988 data_time: 0.0036 memory: 11636 loss: 1.2284 grad_norm: 0.0502 2024/02/17 23:08:19 - mmengine - INFO - Epoch(train) [1][1470/2180] lr: 4.8058e-05 eta: 1:06:05 time: 5.0266 data_time: 0.0041 memory: 11636 loss: 1.1328 grad_norm: 0.0502 2024/02/17 23:09:10 - mmengine - INFO - Epoch(train) [1][1480/2180] lr: 4.6832e-05 eta: 1:05:07 time: 5.1301 data_time: 0.0042 memory: 11636 loss: 1.1505 grad_norm: 0.0501 2024/02/17 23:10:01 - mmengine - INFO - Epoch(train) [1][1490/2180] lr: 4.5617e-05 eta: 1:04:09 time: 5.0857 data_time: 0.0047 memory: 11636 loss: 1.3214 grad_norm: 0.0498 2024/02/17 23:10:53 - mmengine - INFO - after_train_iter in EvaluateChatHook. 2024/02/17 23:11:11 - mmengine - INFO - Sample output: <s> <|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点: 1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。 2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。 3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。 4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。 5. 上海迪士尼乐园:这是一座世界著名的主题公园,位于上海浦东新区。游客可以在这里体验到许多刺激的游乐设施和精彩的表演,还可以与迪士尼的卡通人物合影留念。</s> 2024/02/17 23:11:24 - mmengine - INFO - Sample output: <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: This is a famous waterfront promenade in Shanghai that offers stunning views of the city's skyline and the Huangpu River. 2. Yu Garden: This is a traditional Chinese garden located in the heart of Shanghai. 
It features beautiful pavilions, rock formations, and water features. 3. Oriental Pearl Tower: This is a modern landmark in Shanghai that offers panoramic views of the city from its observation deck. 4. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers breathtaking views of the city from its observation deck. 5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s> 2024/02/17 23:11:24 - mmengine - INFO - Epoch(train) [1][1500/2180] lr: 4.4413e-05 eta: 1:03:12 time: 5.2075 data_time: 0.0184 memory: 11636 loss: 1.1746 grad_norm: 0.0498 2024/02/17 23:12:27 - mmengine - INFO - Epoch(train) [1][1510/2180] lr: 4.3221e-05 eta: 1:02:33 time: 9.4123 data_time: 3.1212 memory: 11636 loss: 1.3023 grad_norm: 0.0496 2024/02/17 23:13:27 - mmengine - INFO - Epoch(train) [1][1520/2180] lr: 4.2041e-05 eta: 1:01:38 time: 5.9369 data_time: 0.0142 memory: 11636 loss: 1.1354 grad_norm: 0.0494 2024/02/17 23:14:24 - mmengine - INFO - Epoch(train) [1][1530/2180] lr: 4.0872e-05 eta: 1:00:43 time: 5.7390 data_time: 0.0055 memory: 11636 loss: 1.1870 grad_norm: 0.0494 2024/02/17 23:15:20 - mmengine - INFO - Epoch(train) [1][1540/2180] lr: 3.9716e-05 eta: 0:59:47 time: 5.6254 data_time: 0.0040 memory: 11636 loss: 1.2049 grad_norm: 0.0487 2024/02/17 23:16:16 - mmengine - INFO - Epoch(train) [1][1550/2180] lr: 3.8573e-05 eta: 0:58:51 time: 5.5665 data_time: 0.0055 memory: 11636 loss: 1.2489 grad_norm: 0.0487 2024/02/17 23:17:11 - mmengine - INFO - Epoch(train) [1][1560/2180] lr: 3.7442e-05 eta: 0:57:54 time: 5.5250 data_time: 0.0046 memory: 11636 loss: 1.2970 grad_norm: 0.0474 2024/02/17 23:18:06 - mmengine - INFO - Epoch(train) [1][1570/2180] lr: 3.6324e-05 eta: 0:56:58 time: 5.5240 data_time: 0.0055 memory: 11636 loss: 1.2150 grad_norm: 0.0461 2024/02/17 23:19:01 - mmengine - INFO - Epoch(train) [1][1580/2180] lr: 
3.5220e-05 eta: 0:56:02 time: 5.5079 data_time: 0.0081 memory: 11636 loss: 1.1859 grad_norm: 0.0461 2024/02/17 23:19:56 - mmengine - INFO - Epoch(train) [1][1590/2180] lr: 3.4129e-05 eta: 0:55:05 time: 5.5058 data_time: 0.0066 memory: 11636 loss: 1.3507 grad_norm: 0.0463 2024/02/17 23:20:52 - mmengine - INFO - Epoch(train) [1][1600/2180] lr: 3.3051e-05 eta: 0:54:09 time: 5.5820 data_time: 0.0074 memory: 11636 loss: 1.2383 grad_norm: 0.0461 2024/02/17 23:21:47 - mmengine - INFO - Epoch(train) [1][1610/2180] lr: 3.1988e-05 eta: 0:53:13 time: 5.4801 data_time: 0.0087 memory: 11636 loss: 1.3592 grad_norm: 0.0461 2024/02/17 23:22:42 - mmengine - INFO - Epoch(train) [1][1620/2180] lr: 3.0938e-05 eta: 0:52:16 time: 5.4974 data_time: 0.0208 memory: 11636 loss: 1.1963 grad_norm: 0.0463 2024/02/17 23:23:37 - mmengine - INFO - Epoch(train) [1][1630/2180] lr: 2.9903e-05 eta: 0:51:20 time: 5.4825 data_time: 0.0088 memory: 11636 loss: 1.3850 grad_norm: 0.0463 2024/02/17 23:24:32 - mmengine - INFO - Epoch(train) [1][1640/2180] lr: 2.8883e-05 eta: 0:50:23 time: 5.4846 data_time: 0.0080 memory: 11636 loss: 1.2132 grad_norm: 0.0464 2024/02/17 23:25:27 - mmengine - INFO - Epoch(train) [1][1650/2180] lr: 2.7877e-05 eta: 0:49:27 time: 5.4914 data_time: 0.0061 memory: 11636 loss: 1.1677 grad_norm: 0.0466 2024/02/17 23:26:22 - mmengine - INFO - Epoch(train) [1][1660/2180] lr: 2.6886e-05 eta: 0:48:31 time: 5.4949 data_time: 0.0085 memory: 11636 loss: 1.3552 grad_norm: 0.0466 2024/02/17 23:27:17 - mmengine - INFO - Epoch(train) [1][1670/2180] lr: 2.5911e-05 eta: 0:47:34 time: 5.5018 data_time: 0.0034 memory: 11636 loss: 1.2760 grad_norm: 0.0474 2024/02/17 23:28:12 - mmengine - INFO - Epoch(train) [1][1680/2180] lr: 2.4951e-05 eta: 0:46:38 time: 5.4976 data_time: 0.0053 memory: 11636 loss: 1.2833 grad_norm: 0.0478 2024/02/17 23:29:07 - mmengine - INFO - Epoch(train) [1][1690/2180] lr: 2.4006e-05 eta: 0:45:42 time: 5.4972 data_time: 0.0034 memory: 11636 loss: 1.3471 grad_norm: 0.0478 
2024/02/17 23:30:02 - mmengine - INFO - Epoch(train) [1][1700/2180] lr: 2.3077e-05 eta: 0:44:46 time: 5.4939 data_time: 0.0044 memory: 11636 loss: 1.1369 grad_norm: 0.0477 2024/02/17 23:30:58 - mmengine - INFO - Epoch(train) [1][1710/2180] lr: 2.2165e-05 eta: 0:43:50 time: 5.6706 data_time: 0.0039 memory: 11636 loss: 1.2459 grad_norm: 0.0477 2024/02/17 23:31:53 - mmengine - INFO - Epoch(train) [1][1720/2180] lr: 2.1268e-05 eta: 0:42:54 time: 5.4781 data_time: 0.0043 memory: 11636 loss: 1.0744 grad_norm: 0.0485 2024/02/17 23:32:48 - mmengine - INFO - Epoch(train) [1][1730/2180] lr: 2.0388e-05 eta: 0:41:57 time: 5.4837 data_time: 0.0042 memory: 11636 loss: 1.3386 grad_norm: 0.0483 2024/02/17 23:33:43 - mmengine - INFO - Epoch(train) [1][1740/2180] lr: 1.9524e-05 eta: 0:41:01 time: 5.4826 data_time: 0.0081 memory: 11636 loss: 0.9797 grad_norm: 0.0483 2024/02/17 23:34:38 - mmengine - INFO - Epoch(train) [1][1750/2180] lr: 1.8677e-05 eta: 0:40:05 time: 5.4946 data_time: 0.0054 memory: 11636 loss: 1.2409 grad_norm: 0.0485 2024/02/17 23:35:33 - mmengine - INFO - Epoch(train) [1][1760/2180] lr: 1.7847e-05 eta: 0:39:09 time: 5.4964 data_time: 0.0044 memory: 11636 loss: 1.1225 grad_norm: 0.0493 2024/02/17 23:36:28 - mmengine - INFO - Epoch(train) [1][1770/2180] lr: 1.7034e-05 eta: 0:38:13 time: 5.4947 data_time: 0.0052 memory: 11636 loss: 1.3994 grad_norm: 0.0493 2024/02/17 23:37:23 - mmengine - INFO - Epoch(train) [1][1780/2180] lr: 1.6238e-05 eta: 0:37:16 time: 5.5082 data_time: 0.0051 memory: 11636 loss: 1.3055 grad_norm: 0.0493 2024/02/17 23:38:17 - mmengine - INFO - Epoch(train) [1][1790/2180] lr: 1.5459e-05 eta: 0:36:20 time: 5.4692 data_time: 0.0047 memory: 11636 loss: 1.2004 grad_norm: 0.0493 2024/02/17 23:39:12 - mmengine - INFO - Epoch(train) [1][1800/2180] lr: 1.4698e-05 eta: 0:35:24 time: 5.4820 data_time: 0.0074 memory: 11636 loss: 1.1879 grad_norm: 0.0497 2024/02/17 23:40:07 - mmengine - INFO - Epoch(train) [1][1810/2180] lr: 1.3955e-05 eta: 0:34:28 time: 
5.4788 data_time: 0.0301 memory: 11636 loss: 1.2325 grad_norm: 0.0495 2024/02/17 23:41:02 - mmengine - INFO - Epoch(train) [1][1820/2180] lr: 1.3230e-05 eta: 0:33:32 time: 5.4864 data_time: 0.0213 memory: 11636 loss: 1.1186 grad_norm: 0.0495 2024/02/17 23:41:57 - mmengine - INFO - Epoch(train) [1][1830/2180] lr: 1.2523e-05 eta: 0:32:36 time: 5.4957 data_time: 0.0063 memory: 11636 loss: 1.3410 grad_norm: 0.0493 2024/02/17 23:42:52 - mmengine - INFO - Epoch(train) [1][1840/2180] lr: 1.1833e-05 eta: 0:31:40 time: 5.5050 data_time: 0.0038 memory: 11636 loss: 1.0078 grad_norm: 0.0489 2024/02/17 23:43:47 - mmengine - INFO - Epoch(train) [1][1850/2180] lr: 1.1163e-05 eta: 0:30:44 time: 5.4952 data_time: 0.0031 memory: 11636 loss: 0.9936 grad_norm: 0.0489 2024/02/17 23:44:42 - mmengine - INFO - Epoch(train) [1][1860/2180] lr: 1.0510e-05 eta: 0:29:48 time: 5.4954 data_time: 0.0041 memory: 11636 loss: 1.2902 grad_norm: 0.0490 2024/02/17 23:45:37 - mmengine - INFO - Epoch(train) [1][1870/2180] lr: 9.8763e-06 eta: 0:28:52 time: 5.5004 data_time: 0.0051 memory: 11636 loss: 1.2188 grad_norm: 0.0490 2024/02/17 23:46:32 - mmengine - INFO - Epoch(train) [1][1880/2180] lr: 9.2612e-06 eta: 0:27:56 time: 5.5028 data_time: 0.0059 memory: 11636 loss: 1.2737 grad_norm: 0.0487 2024/02/17 23:47:27 - mmengine - INFO - Epoch(train) [1][1890/2180] lr: 8.6650e-06 eta: 0:27:00 time: 5.4963 data_time: 0.0046 memory: 11636 loss: 1.1800 grad_norm: 0.0491 2024/02/17 23:48:22 - mmengine - INFO - Epoch(train) [1][1900/2180] lr: 8.0877e-06 eta: 0:26:04 time: 5.4987 data_time: 0.0092 memory: 11636 loss: 1.2071 grad_norm: 0.0491 2024/02/17 23:49:17 - mmengine - INFO - Epoch(train) [1][1910/2180] lr: 7.5295e-06 eta: 0:25:08 time: 5.4994 data_time: 0.0070 memory: 11636 loss: 1.5696 grad_norm: 0.0491 2024/02/17 23:50:12 - mmengine - INFO - Epoch(train) [1][1920/2180] lr: 6.9906e-06 eta: 0:24:12 time: 5.4985 data_time: 0.0063 memory: 11636 loss: 1.3959 grad_norm: 0.0491 2024/02/17 23:51:07 - mmengine - INFO 
- Epoch(train) [1][1930/2180] lr: 6.4709e-06 eta: 0:23:16 time: 5.5110 data_time: 0.0053 memory: 11636 loss: 1.2003 grad_norm: 0.0491 2024/02/17 23:52:02 - mmengine - INFO - Epoch(train) [1][1940/2180] lr: 5.9706e-06 eta: 0:22:20 time: 5.4867 data_time: 0.0057 memory: 11636 loss: 1.1168 grad_norm: 0.0493 2024/02/17 23:52:57 - mmengine - INFO - Epoch(train) [1][1950/2180] lr: 5.4899e-06 eta: 0:21:24 time: 5.5148 data_time: 0.0058 memory: 11636 loss: 1.1945 grad_norm: 0.0493 2024/02/17 23:53:52 - mmengine - INFO - Epoch(train) [1][1960/2180] lr: 5.0288e-06 eta: 0:20:28 time: 5.5197 data_time: 0.0050 memory: 11636 loss: 1.2541 grad_norm: 0.0488 2024/02/17 23:54:47 - mmengine - INFO - Epoch(train) [1][1970/2180] lr: 4.5875e-06 eta: 0:19:32 time: 5.4825 data_time: 0.0094 memory: 11636 loss: 1.2075 grad_norm: 0.0492 2024/02/17 23:55:42 - mmengine - INFO - Epoch(train) [1][1980/2180] lr: 4.1659e-06 eta: 0:18:36 time: 5.4734 data_time: 0.0037 memory: 11636 loss: 1.2364 grad_norm: 0.0492 2024/02/17 23:56:37 - mmengine - INFO - Epoch(train) [1][1990/2180] lr: 3.7643e-06 eta: 0:17:40 time: 5.5883 data_time: 0.0077 memory: 11636 loss: 1.0466 grad_norm: 0.0493 2024/02/17 23:57:32 - mmengine - INFO - after_train_iter in EvaluateChatHook. 2024/02/17 23:57:47 - mmengine - INFO - Sample output: <s> <|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点: 1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。 2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。 3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里感受到中国传统文化的魅力。 4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到上海的美丽夜景。 5. 上海迪士尼乐园:这是一座世界级的主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个愉快的假期。</s> 2024/02/17 23:58:02 - mmengine - INFO - Sample output: <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River. 2. 
Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds. 3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck. 4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck. 5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features traditional architecture, canals, and bridges, and is a great place to experience traditional Chinese culture.</s>
2024/02/17 23:58:02 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948
2024/02/17 23:58:02 - mmengine - INFO - Epoch(train) [1][2000/2180] lr: 3.3826e-06 eta: 0:16:44 time: 5.4714 data_time: 0.0052 memory: 11636 loss: 1.0977 grad_norm: 0.0501
2024/02/17 23:59:06 - mmengine - INFO - Epoch(train) [1][2010/2180] lr: 3.0210e-06 eta: 0:15:52 time: 9.4120 data_time: 2.9652 memory: 11636 loss: 1.3637 grad_norm: 0.0501
2024/02/18 00:00:08 - mmengine - INFO - Epoch(train) [1][2020/2180] lr: 2.6795e-06 eta: 0:14:56 time: 6.2104 data_time: 0.0063 memory: 11636 loss: 1.1587 grad_norm: 0.0506
2024/02/18 00:01:06 - mmengine - INFO - Epoch(train) [1][2030/2180] lr: 2.3583e-06 eta: 0:14:00 time: 5.7750 data_time: 0.0040 memory: 11636 loss: 1.1819 grad_norm: 0.0506
2024/02/18 00:02:03 - mmengine - INFO - Epoch(train) [1][2040/2180] lr: 2.0573e-06 eta: 0:13:04 time: 5.6492 data_time: 0.0046 memory: 11636 loss: 1.1101 grad_norm: 0.0498
2024/02/18 00:02:58 - mmengine - INFO - Epoch(train) [1][2050/2180] lr: 1.7767e-06 eta: 0:12:08 time: 5.5742 data_time: 0.0040 memory: 11636 loss: 1.2090 grad_norm: 0.0496
2024/02/18 00:03:54 - mmengine - INFO - Epoch(train) [1][2060/2180] lr: 1.5164e-06 eta: 0:11:12 time: 5.5272 data_time: 0.0052 memory: 11636 loss: 1.1258 grad_norm: 0.0496
2024/02/18 00:04:49 - mmengine - INFO - Epoch(train) [1][2070/2180] lr: 1.2767e-06 eta: 0:10:16 time: 5.5131 data_time: 0.0040 memory: 11636 loss: 1.1409 grad_norm: 0.0493
2024/02/18 00:05:44 - mmengine - INFO - Epoch(train) [1][2080/2180] lr: 1.0574e-06 eta: 0:09:20 time: 5.5060 data_time: 0.0039 memory: 11636 loss: 1.2263 grad_norm: 0.0487
2024/02/18 00:06:39 - mmengine - INFO - Epoch(train) [1][2090/2180] lr: 8.5865e-07 eta: 0:08:24 time: 5.5179 data_time: 0.0073 memory: 11636 loss: 1.1349 grad_norm: 0.0487
2024/02/18 00:07:30 - mmengine - INFO - Epoch(train) [1][2100/2180] lr: 6.8051e-07 eta: 0:07:28 time: 5.0656 data_time: 0.0039 memory: 11636 loss: 1.0562 grad_norm: 0.0486
2024/02/18 00:08:19 - mmengine - INFO - Epoch(train) [1][2110/2180] lr: 5.2299e-07 eta: 0:06:31 time: 4.8983 data_time: 0.0045 memory: 11636 loss: 1.2508 grad_norm: 0.0486
2024/02/18 00:09:05 - mmengine - INFO - Epoch(train) [1][2120/2180] lr: 3.8613e-07 eta: 0:05:35 time: 4.6767 data_time: 0.0067 memory: 11636 loss: 1.1917 grad_norm: 0.0486
2024/02/18 00:09:53 - mmengine - INFO - Epoch(train) [1][2130/2180] lr: 2.6996e-07 eta: 0:04:39 time: 4.7660 data_time: 0.0111 memory: 11636 loss: 1.3690 grad_norm: 0.0492
2024/02/18 00:10:41 - mmengine - INFO - Epoch(train) [1][2140/2180] lr: 1.7450e-07 eta: 0:03:43 time: 4.7619 data_time: 0.0042 memory: 11636 loss: 1.3171 grad_norm: 0.0492
2024/02/18 00:11:29 - mmengine - INFO - Epoch(train) [1][2150/2180] lr: 9.9772e-08 eta: 0:02:47 time: 4.7938 data_time: 0.0035 memory: 11636 loss: 1.2915 grad_norm: 0.0488
2024/02/18 00:12:18 - mmengine - INFO - Epoch(train) [1][2160/2180] lr: 4.5789e-08 eta: 0:01:51 time: 4.9155 data_time: 0.0060 memory: 11636 loss: 1.2895 grad_norm: 0.0488
2024/02/18 00:13:07 - mmengine - INFO - Epoch(train) [1][2170/2180] lr: 1.2564e-08 eta: 0:00:55 time: 4.9406 data_time: 0.0041 memory: 11636 loss: 1.2730 grad_norm: 0.0488
2024/02/18 00:13:57 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948
2024/02/18 00:13:57 - mmengine - INFO - Epoch(train) [1][2180/2180] lr: 1.0384e-10 eta: 0:00:00 time: 4.9730 data_time: 0.0034 memory: 11636 loss: 1.1705 grad_norm: 0.0532
2024/02/18 00:13:57 - mmengine - INFO - Saving checkpoint at 1 epochs
2024/02/18 00:13:59 - mmengine - INFO - after_train in EvaluateChatHook.
2024/02/18 00:14:17 - mmengine - INFO - Sample output: <s> <|User|>:请给我介绍五个上海的景点<eoh> <|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点: 1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。 2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。 3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。 4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。 5. 上海迪士尼乐园:这是一座世界著名的主题公园,位于上海浦东新区。游客可以在这里体验到许多刺激的游乐设施和精彩的表演,还可以与迪士尼的卡通人物合影留念。</s>
2024/02/18 00:14:31 - mmengine - INFO - Sample output: <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh> <|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River. 2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and a pond. 3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck. 4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck. 5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s>
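The progress records in the log above follow a fixed `key: value` layout, so they can be pulled into a script to check how the loss and learning rate evolve. The following is a minimal sketch that assumes exactly the record layout shown above; `LOG_PATTERN` and `parse_train_records` are names made up for this note and are not part of mmengine or XTuner.

```python
import re

# Matches the "Epoch(train)" progress records as they appear in the log above.
# \s+ is used between fields so the pattern also survives wrapped lines.
LOG_PATTERN = re.compile(
    r"Epoch\(train\)\s+\[(?P<epoch>\d+)\]\[(?P<step>\d+)/(?P<total>\d+)\]\s+"
    r"lr:\s+(?P<lr>\S+)\s+eta:\s+(?P<eta>\S+)\s+time:\s+(?P<time>\S+)\s+"
    r"data_time:\s+(?P<data_time>\S+)\s+memory:\s+(?P<memory>\d+)\s+"
    r"loss:\s+(?P<loss>\S+)\s+grad_norm:\s+(?P<grad_norm>\S+)"
)


def parse_train_records(log_text):
    """Return one dict per training-progress record found in log_text."""
    records = []
    for m in LOG_PATTERN.finditer(log_text):
        records.append({
            "epoch": int(m["epoch"]),
            "step": int(m["step"]),
            "total": int(m["total"]),
            "lr": float(m["lr"]),
            "memory": int(m["memory"]),
            "loss": float(m["loss"]),
            "grad_norm": float(m["grad_norm"]),
        })
    return records


# Two records copied from the log above.
sample = (
    "2024/02/17 23:58:02 - mmengine - INFO - Epoch(train) [1][2000/2180] "
    "lr: 3.3826e-06 eta: 0:16:44 time: 5.4714 data_time: 0.0052 "
    "memory: 11636 loss: 1.0977 grad_norm: 0.0501\n"
    "2024/02/18 00:13:57 - mmengine - INFO - Epoch(train) [1][2180/2180] "
    "lr: 1.0384e-10 eta: 0:00:00 time: 4.9730 data_time: 0.0034 "
    "memory: 11636 loss: 1.1705 grad_norm: 0.0532\n"
)

records = parse_train_records(sample)
for r in records:
    print(r["step"], r["lr"], r["loss"])
```

Feeding the whole log file to `parse_train_records` gives a list that can be plotted or averaged; in the records above the loss stays between roughly 1.05 and 1.37 while the learning rate decays toward zero.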
Enable DeepSpeed acceleration and set max_epochs to 2.
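The two changes in this step can be sketched as follows. This is a hedged illustration, not the exact procedure used in these notes: it assumes the copied config contains a top-level `max_epochs = N` assignment, and that DeepSpeed is switched on with `--deepspeed deepspeed_zero2` on the `xtuner train` command line (the form I recall from the XTuner documentation; verify it against the docs for your installed version). `set_max_epochs`, `build_train_command`, and the stub config text are made up for this example.

```python
import re


def set_max_epochs(config_text, epochs):
    """Rewrite the top-level `max_epochs = N` assignment in a config's text."""
    new_text, count = re.subn(
        r"^max_epochs\s*=\s*\d+",
        f"max_epochs = {epochs}",
        config_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError(f"expected one max_epochs assignment, found {count}")
    return new_text


def build_train_command(config_path, deepspeed="deepspeed_zero2"):
    """Build the training command as an argument list (it is not executed here)."""
    return ["xtuner", "train", config_path, "--deepspeed", deepspeed]


# Hypothetical stand-in for the real config file's contents.
example_config = "batch_size = 1\nmax_epochs = 3\nlr = 2e-4\n"

updated = set_max_epochs(example_config, 2)
command = build_train_command("./internlm_chat_7b_qlora_oasst1_e3_copy.py")

print(updated)
print(" ".join(command))
```

In practice the same edit can be made by hand in the config file; the helper only makes explicit which line changes.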
Convert the model file to HF (Hugging Face) format:
mkdir hf
export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert pth_to_hf ./internlm_chat_7b_qlora_oasst1_e3_copy.py ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth ./hf
Screen output:
(xtuner0.1.9) root@intern-studio-069640:~# cd ft-oasst1/
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# ls
internlm-chat-7b internlm_chat_7b_qlora_oasst1_e3_copy.py openassistant-guanaco work_dirs work_dirs_20140217
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# mkdir hf
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# export MKL_SERVICE_FORCE_INTEL=1
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# export MKL_THREADING_LAYER=GNU
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner convert pth_to_hf ./internlm_chat_7b_qlora_oasst1_e3_copy.py ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth ./hf
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
[2024-02-18 09:06:19,611] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
[2024-02-18 09:06:35,773] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:25<00:00, 3.17s/it]
02/18 09:07:08 - mmengine - INFO - dispatch internlm attn forward
02/18 09:07:08 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the `output_attentions` flag is set to True, it is not possible to return the `attn_weights`.
Load PTH model from ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth
Convert weights to float16
Saving HuggingFace model to ./hf
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/peft/utils/save_and_load.py:148: UserWarning: Could not find a config file in ./internlm-chat-7b - will assume that the vocabulary was not modified.
warnings.warn(
All done!
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1#
Merge the HF-format LoRA adapter (the incremental weights) into the InternLM 7B base model:
xtuner convert merge ./internlm-chat-7b ./hf ./merged --max-shard-size 2GB
Screen output:
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner convert merge ./internlm-chat-7b ./hf ./merged --max-shard-size 2GB
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
[2024-02-18 09:26:33,485] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:13<00:00, 1.63s/it]
Saving to ./merged...
All done!
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# ls
hf internlm-chat-7b internlm_chat_7b_qlora_oasst1_e3_copy.py merged openassistant-guanaco work_dirs work_dirs_20140217
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1#
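Conceptually, the merge step folds the adapter's two small matrices back into the base weights, so the merged model no longer needs the adapter at inference time. The toy sketch below illustrates the standard LoRA formulation, W' = W + (alpha / r) * B * A, in plain Python. It is not XTuner's merge code; the function names and all numbers are invented for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy sizes: W is 3x3, the bypass has rank 1 (A is 1x3, B is 3x1).
W = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
A = [[1, 2, 0]]
B = [[1], [0], [2]]
alpha, rank = 2, 1
x = [[1], [1], [1]]  # one input column vector

# Path 1: base model plus adapter bypass, y = W x + scale * B (A x).
base_out = matmul(W, x)
bypass_out = matmul(B, matmul(A, x))
y_adapter = [[base_out[i][0] + (alpha / rank) * bypass_out[i][0]] for i in range(3)]

# Path 2: merged weights only, y = W' x.
merged = merge_lora(W, A, B, alpha, rank)
y_merged = matmul(merged, x)

print(merged)
print(y_adapter, y_merged)
```

Both paths produce the same output, which is why the merged directory can be loaded like an ordinary model without the separate adapter files.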
Chat with the merged model:
xtuner chat ./merged --prompt-template internlm_chat
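The `--prompt-template internlm_chat` option makes `xtuner chat` wrap each user message in the InternLM chat markers before passing it to the model. A minimal sketch of that wrapping is shown here; the format is inferred from the "Sample output" lines in the training log earlier in these notes, the newline after `<eoh>` is an assumption, and `build_internlm_chat_prompt` is a hypothetical helper rather than an XTuner function.

```python
def build_internlm_chat_prompt(user_message):
    """Wrap one user message in the markers seen in the log's sample outputs."""
    # Assumption: a newline separates <eoh> from <|Bot|>: (the log does not show it).
    return f"<|User|>:{user_message}<eoh>\n<|Bot|>:"


prompt = build_internlm_chat_prompt("Please tell me five scenic spots in Shanghai")
print(prompt)
```

The model then continues the text after `<|Bot|>:`, which matches the shape of the sample outputs in the log.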
Output:
posted on 2024-02-17 08:49 littlesuccess