BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

This post is a translation of the BERT paper together with a glossary of its key terms.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (Google AI Language)

Abstract

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 points absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 points absolute improvement).
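
The phrase "jointly conditioning on both left and right context" can be made concrete with attention masks. The following is a minimal sketch, not the paper's implementation: the function names and the example sentence are hypothetical, and the left-to-right mask is included only as an assumed point of contrast.

```python
def attention_mask(seq_len, bidirectional):
    """Return a seq_len x seq_len matrix of 0/1 flags.

    mask[i][j] == 1 means position i may attend to position j.
    A bidirectional mask allows every pair; a left-to-right mask
    only allows positions j <= i.
    """
    return [
        [1 if bidirectional or j <= i else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_context(tokens, position, bidirectional):
    """List the tokens that `position` can condition on under the mask."""
    mask = attention_mask(len(tokens), bidirectional)
    return [tok for tok, allowed in zip(tokens, mask[position]) if allowed]


# Hypothetical example sentence, chosen for illustration only.
tokens = ["the", "bank", "of", "the", "river"]

# Context available to the token "bank" (index 1) under each scheme.
left_to_right = visible_context(tokens, 1, bidirectional=False)
both_sides = visible_context(tokens, 1, bidirectional=True)

print(left_to_right)
print(both_sides)
```

Under the left-to-right mask, "bank" sees only itself and the token before it, while under the bidirectional mask it also sees "river", which is the kind of right-hand context the abstract refers to.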

Glossary

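Transformer: a neural network architecture built on self-attention (Vaswani et al., 2017). BERT uses the Transformer encoder as the basis of its language representation model.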
empirically: by means of observation or experience rather than theory or pure logic.
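
For the Transformer entry above, the core operation is scaled dot-product self-attention. The following is a rough sketch in plain Python under simplifying assumptions: it uses the input vectors directly as queries, keys, and values, and leaves out the learned projections, multiple heads, and other layers of a real Transformer. The function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy input: three positions, each a 2-dimensional vector.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(x, x, x)
print(out)
```

Because every position's weights cover all positions, each output vector mixes information from the whole sequence, both to the left and to the right.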

Introduction

ChangeLog

2022/1/10 20:32 To be continued...

posted @ 2022-01-11 20:06 千心
