News

Models and Scripts

BERT pre-training on BooksCorpus and English Wikipedia with mixed precision and gradient accumulation on GPUs. We achieved the following fine-tuning results on the validation sets based on the produced checkpoint (#482, #505, #489); a minimal gradient-accumulation sketch follows the results table. Thank you @haven-jeon

| Dataset | MRPC   | SQuAD 1.1   | SST-2 | MNLI-mm |
|---------|--------|-------------|-------|---------|
| Score   | 87.99% | 80.99/88.60 | 93%   | 83.6%   |
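
The gradient-accumulation part of the pre-training setup boils down to summing gradients over several micro-batches before taking a single optimizer step. Below is a minimal sketch of that pattern in MXNet Gluon with a stand-in network and dummy data; it is illustrative only, not the actual pre-training script.

```python
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(2)                       # stand-in for the BERT model
net.initialize()
# 'add' makes each backward() accumulate into the existing gradient buffers
net.collect_params().setattr('grad_req', 'add')

trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-4})
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
accumulate = 4                                # micro-batches per optimizer step

# dummy batches standing in for the real pre-training data
batches = [(mx.nd.random.uniform(shape=(8, 16)), mx.nd.zeros((8,))) for _ in range(8)]

for step, (data, label) in enumerate(batches, start=1):
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()                           # adds into the gradient buffers
    if step % accumulate == 0:
        trainer.step(accumulate * data.shape[0])   # normalize by the effective batch size
        for param in net.collect_params().values():
            param.zero_grad()                 # reset the accumulated gradients
```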

BERT fine-tuning on various sentence classification datasets with checkpoints converted from the official repository (#600, #571, #481); a minimal fine-tuning setup sketch follows the results table. Thank you @kenjewu @haven-jeon

| Dataset | MRPC  | RTE   | SST-2 | MNLI-m/mm      |
|---------|-------|-------|-------|----------------|
| Score   | 88.7% | 70.8% | 93%   | 84.55%, 84.66% |
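
For sentence-pair classification, fine-tuning amounts to loading the converted encoder and putting a small classification head on the pooled [CLS] output. The sketch below uses the stock GluonNLP identifiers, and the Dense head is illustrative; the actual script uses its own classifier wrapper.

```python
import mxnet as mx
import gluonnlp as nlp
from mxnet import gluon

# load the converted BERT Base encoder together with its vocabulary
bert, vocab = nlp.model.get_model('bert_12_768_12',
                                  dataset_name='book_corpus_wiki_en_uncased',
                                  pretrained=True,
                                  use_pooler=True,     # pooled [CLS] vector for classification
                                  use_decoder=False,
                                  use_classifier=False)

# small task-specific head, e.g. 2 classes for MRPC
classifier = gluon.nn.HybridSequential()
classifier.add(gluon.nn.Dropout(0.1))
classifier.add(gluon.nn.Dense(2))
classifier.initialize(mx.init.Normal(0.02))

def predict(token_ids, segment_ids, valid_length):
    # BERT returns (sequence_output, pooled_output) when use_pooler=True
    _, pooled = bert(token_ids, segment_ids, valid_length)
    return classifier(pooled)
```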

BERT fine-tuning on question answering datasets with checkpoints converted from the official repository (#493); a sketch of the span-prediction head follows the results table. Thank you @fierceX

| Dataset | SQuAD 1.1      | SQuAD 1.1       | SQuAD 2.0       |
|---------|----------------|-----------------|-----------------|
| Model   | bert_12_768_12 | bert_24_1024_16 | bert_24_1024_16 |
| F1/EM   | 88.53/80.98    | 90.97/84.05     | 77.96/81.02     |
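
For SQuAD-style question answering, the head is a span predictor: one linear layer over every token position whose two outputs are the start and end logits. The sketch below assumes the stock GluonNLP encoder; the actual script wraps this in its own model class, so the names here are illustrative only.

```python
import mxnet as mx
import gluonnlp as nlp
from mxnet import gluon

bert, vocab = nlp.model.get_model('bert_12_768_12',
                                  dataset_name='book_corpus_wiki_en_uncased',
                                  pretrained=True, use_pooler=False,
                                  use_decoder=False, use_classifier=False)

# Dense(2) applied per token: channel 0 -> start logit, channel 1 -> end logit
span_head = gluon.nn.Dense(2, flatten=False)
span_head.initialize(mx.init.Normal(0.02))

def span_logits(token_ids, segment_ids, valid_length):
    seq_out = bert(token_ids, segment_ids, valid_length)   # (batch, seq_len, 768)
    logits = span_head(seq_out)                             # (batch, seq_len, 2)
    start_logits, end_logits = mx.nd.split(logits, num_outputs=2,
                                           axis=2, squeeze_axis=True)
    return start_logits, end_logits                         # each (batch, seq_len)
```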

BERT model conversion scripts for checkpoints from the original TensorFlow repository, and more converted models (#456, #461, #449); a loading sketch follows the list. Thank you @fierceX:

Multilingual Wikipedia (cased, BERT Base)

Chinese Wikipedia (cased, BERT Base)

Books Corpus & English Wikipedia (uncased, BERT Large)
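
The conversion scripts produce standard Gluon parameter files, which can be loaded back into the GluonNLP BERT architecture. The sketch below is hypothetical: the parameter path is a placeholder, and the dataset_name string for the multilingual vocabulary is an assumption that should be checked against the model zoo.

```python
import gluonnlp as nlp

# build the architecture without downloading pretrained weights
bert, vocab = nlp.model.get_model('bert_12_768_12',
                                  dataset_name='wiki_multilingual_cased',  # assumed identifier
                                  pretrained=False,
                                  use_decoder=False,
                                  use_classifier=False)
# load the parameters produced by the conversion script (placeholder path)
bert.load_parameters('converted_bert.params')
```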

Scripts and command line interface for BERT embedding of raw sentences (#587, #618). Thank you @imgarylai
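
Under the hood, embedding raw sentences means tokenizing with the BERT vocabulary, running the encoder, and keeping the per-token outputs. The sketch below shows those steps directly with the stock GluonNLP identifiers; it is not the command line interface added by the scripts.

```python
import mxnet as mx
import gluonnlp as nlp

bert, vocab = nlp.model.get_model('bert_12_768_12',
                                  dataset_name='book_corpus_wiki_en_uncased',
                                  pretrained=True, use_pooler=False,
                                  use_decoder=False, use_classifier=False)
tokenizer = nlp.data.BERTTokenizer(vocab, lower=True)

tokens = ['[CLS]'] + tokenizer('GluonNLP makes BERT easy to use.') + ['[SEP]']
token_ids = mx.nd.array([vocab.to_indices(tokens)])
segment_ids = mx.nd.zeros_like(token_ids)          # single sentence -> all segment 0
valid_length = mx.nd.array([len(tokens)])

embeddings = bert(token_ids, segment_ids, valid_length)
print(embeddings.shape)                             # (1, num_tokens, 768)
```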