Summary: Contents: Overview, Notation, GaLore. Zhao J., Zhang Z., Chen B., Wang Z., Anandkumar A. and Tian Y. GaLore: Memory-efficient LLM training by gradient low-rank projection. IC...
posted @ 2024-08-27 16:05 馒头and花卷 Views (55) Comments (0) Recommendations (0)
Summary: Contents: Overview, BAdam, Code. Luo Q., Yu H. and Li X. BAdam: A memory efficient full parameter optimization method for large language models. arXiv preprint, 2024. Overview: This paper introduces...
posted @ 2024-08-27 10:12 馒头and花卷 Views (67) Comments (0) Recommendations (0)