摘要: GPT系列 GPT2 The GPT-2 is built using transformer decoder blocks. BERT, on the other hand, uses transformer encoder blocks. auto-regressive: outputs one 阅读全文
posted @ 2023-02-07 16:43 鱼与鱼 阅读(135) 评论(0) 推荐(0) 编辑