ALBERT

Google · September 2019

active · Open Source · encoder only · text

Description

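ALBERT ('A Lite BERT') is Google's variant of BERT that dramatically reduces the parameter count through two techniques: cross-layer parameter sharing, where every transformer layer reuses the same weights, and factorized embedding parameterization, where the large vocabulary embedding matrix is split into two smaller matrices. In the configuration comparable to BERT-large, it achieves comparable performance with about 18× fewer parameters.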
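The effect of the two techniques on parameter count can be illustrated with simple arithmetic. The sketch below is not ALBERT's implementation; the function names are hypothetical, the per-layer formula is a simplified count for a standard transformer encoder layer, and the sizes (vocabulary 30,000, hidden size 1,024, embedding size 128, 24 layers) are assumed values chosen for illustration.

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Count embedding parameters.

    Without factorization: one vocab_size x hidden_size matrix.
    With factorization: a vocab_size x embedding_size matrix followed by
    an embedding_size x hidden_size projection.
    """
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size


def layer_params(hidden_size, ffn_size):
    """Simplified weight-and-bias count for one transformer encoder layer."""
    attention = 4 * (hidden_size * hidden_size + hidden_size)  # Q, K, V, output
    feed_forward = (hidden_size * ffn_size + ffn_size) + (ffn_size * hidden_size + hidden_size)
    layer_norms = 2 * 2 * hidden_size  # two layer norms, scale and bias each
    return attention + feed_forward + layer_norms


def encoder_params(per_layer, num_layers, share_across_layers):
    """With cross-layer sharing, all layers reuse a single set of weights."""
    return per_layer if share_across_layers else per_layer * num_layers


# Assumed illustrative sizes, not taken from this page.
VOCAB, HIDDEN, EMBED, FFN, LAYERS = 30_000, 1_024, 128, 4_096, 24

per_layer = layer_params(HIDDEN, FFN)
unshared_total = embedding_params(VOCAB, HIDDEN) + encoder_params(per_layer, LAYERS, False)
shared_total = embedding_params(VOCAB, HIDDEN, EMBED) + encoder_params(per_layer, LAYERS, True)

print(f"unshared, unfactorized: {unshared_total:,}")
print(f"shared, factorized:     {shared_total:,}")
print(f"reduction factor:       {unshared_total / shared_total:.1f}x")
```

Under these assumptions the encoder stack shrinks by a factor equal to the number of layers, and the embedding table shrinks because the vocabulary dimension is multiplied by the small embedding size instead of the hidden size. The count ignores position embeddings and output heads, so the ratio it prints is only indicative.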

Key Innovations

parameter-sharing
Masked LM: Training by randomly hiding words and having the model predict them, BERT's key innovation for understanding context.
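A minimal sketch of the masking step described above, assuming whitespace-split tokens and a single "[MASK]" placeholder. The function name, the 15% default rate, and the seed parameter are illustrative choices, and production implementations add refinements that are omitted here.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Randomly hide tokens and record the originals as prediction targets.

    Returns the masked token list and a dict mapping each masked
    position to the original token the model should predict.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets


sentence = "the model learns context by predicting hidden words".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3)
print(masked)
print(targets)
```

During training, the model sees only the masked sequence and is scored on how well it recovers the tokens stored in the targets, which forces it to use context from both sides of each gap.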

Family Tree

Built On

Lineage

BERT → ALBERT
