BERT

Foundation

BERT (Bidirectional Encoder Representations from Transformers) is a pretrained language model that reads text in both directions to understand context. It powers many NLP tasks like search and question answering.

In depth

BERT was introduced by Google researchers in 2018 as a breakthrough in natural language processing. Unlike earlier models that processed text left‑to‑right or right‑to‑left only, BERT’s bidirectional design lets each word see the full context of the sentence, leading to a deeper understanding of meaning.

At its core, BERT stacks multiple Transformer encoder layers and is trained using two self‑supervised objectives: masked language modeling (randomly hiding words and predicting them) and next sentence prediction (learning whether two sentences follow each other). This pretraining on large text corpora yields a general‑purpose language representation that can be fine‑tuned for specific downstream tasks with relatively little additional data.
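The masked language modeling objective is easy to see in practice. The sketch below assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint are installed; it is an illustration of the fill-in-the-blank behavior described above, not part of BERT's original training code.

```python
# Minimal sketch of masked language modeling with a pretrained BERT,
# assuming the "transformers" library and the "bert-base-uncased" checkpoint.
from transformers import pipeline

# The fill-mask pipeline loads a pretrained BERT and predicts the hidden token.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses context on both sides of the [MASK] token to rank candidates.
for prediction in fill_mask("The doctor prescribed a [MASK] for the infection."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The same pretrained weights can then be fine-tuned for a downstream task (for example, sentence classification) by adding a small task-specific head and training on labeled data.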

The impact of BERT has been enormous: it set new state‑of‑the‑art results on benchmarks like GLUE and SQuAD, and it enabled a shift toward large‑scale pretrained models that can be adapted to many applications. Its success paved the way for later models such as RoBERTa, ALBERT, and the broader family of foundation models that now underpin modern AI systems.

Examples

["Google Search uses BERT to better interpret the intent behind user queries, improving results for complex or conversational questions.","Customer service chatbots fine‑tune BERT to accurately detect sentiment and extract key information from support tickets.","Biomedical researchers apply a BERT variant (BioBERT) to extract drug‑disease relationships from scientific literature."]

Related terms