BigCode StarCoder Model Family
StarCoderBase
HF: https://huggingface.co/bigcode/starcoderbase
Training dataset: The Stack v1.2
Orchestration: bigcode/Megatron-LM
Neural networks: PyTorch
#Parameters: 15.5B
#TrainingTokens: 1T
#ContextWindow: 8192 tokens
#GPUs: 512 Tesla A100
#TrainingTime: 24 days
Language: 80+ Programming languages
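As a rough sanity check on the figures listed above, the following sketch derives a few back-of-envelope quantities: total GPU-hours, training tokens per parameter, and average tokens processed per GPU per second. It assumes all 512 GPUs ran continuously for the full 24 days, which the listing does not state, so the results are estimates rather than reported numbers. The variable names are illustrative.

```python
# Back-of-envelope estimates from the StarCoderBase figures listed above.
# Assumption (not stated in the listing): all 512 GPUs ran continuously
# for the full 24 days of wall-clock time.
PARAMS = 15.5e9          # 15.5B parameters
TRAINING_TOKENS = 1e12   # 1T training tokens
NUM_GPUS = 512
TRAINING_DAYS = 24

gpu_hours = NUM_GPUS * TRAINING_DAYS * 24
tokens_per_param = TRAINING_TOKENS / PARAMS
tokens_per_gpu_second = TRAINING_TOKENS / (gpu_hours * 3600)

print(f"Estimated GPU-hours:        {gpu_hours:,}")
print(f"Tokens per parameter:       {tokens_per_param:.1f}")
print(f"Tokens per GPU per second:  {tokens_per_gpu_second:.0f}")
```

Under these assumptions the model saw roughly 65 tokens per parameter, at an average of a little under 1,000 tokens per GPU per second.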
StarCoder
Fine-tuned from StarCoderBase on 35B Python tokens (2 epochs over the Python subset of the same training dataset).
HF: https://huggingface.co/bigcode/starcoder
Language: 80+ Programming languages
StarCoder-Megatron
Megatron-version of StarCoder
HF: https://huggingface.co/bigcode/starcoder-megatron
Language: 80+ Programming languages
StarCoder Plus
Fine-tuned from StarCoderBase on 600B tokens from the English web dataset RefinedWeb, combined with StarCoderData from The Stack (v1.2) and a Wikipedia dataset.
HF: https://huggingface.co/bigcode/starcoderplus
Language: English & 80+ Programming languages
StarChat Alpha
Fine-tuned from StarCoderBase on a blend of the oasst1 and databricks-dolly-15k datasets.
HF: https://huggingface.co/HuggingFaceH4/starchat-alpha
GitHub: https://github.com/bigcode-project/starcoder
StarChat Beta
Fine-tuned from StarCoderPlus on an "uncensored" variant of the openassistant-guanaco dataset.
HF: https://huggingface.co/HuggingFaceH4/starchat-beta
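The fine-tuning relationships described in the list above can be summarized as a small child-to-parent mapping. This is only an illustrative sketch: the dictionary `BASE_OF` and the helper `lineage` are hypothetical names, and StarCoder-Megatron is left out because the list does not describe it as a fine-tune.

```python
# Fine-tuning lineage as described in the list above: child -> parent.
# BASE_OF and lineage() are hypothetical names used for illustration only.
BASE_OF = {
    "StarCoder": "StarCoderBase",
    "StarCoderPlus": "StarCoderBase",
    "StarChat Alpha": "StarCoderBase",
    "StarChat Beta": "StarCoderPlus",
}


def lineage(model: str) -> list[str]:
    """Return the chain from a model back to its root base model."""
    chain = [model]
    while chain[-1] in BASE_OF:
        chain.append(BASE_OF[chain[-1]])
    return chain


print(" <- ".join(lineage("StarChat Beta")))
# prints "StarChat Beta <- StarCoderPlus <- StarCoderBase"
```

Written this way, it is easy to see that StarChat Beta is two fine-tuning steps away from StarCoderBase, while StarCoder, StarCoderPlus, and StarChat Alpha are each one step away.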