It is a method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. The technique involves training a large Transformer-based model on an unlabeled text corpus and then fine-tuning it for specific NLP tasks such as question answering or sentiment analysis. Because BERT attends to both the left and right context of every token in all layers, it learns bidirectional representations of words rather than the unidirectional (left-to-right or right-to-left) representations of earlier approaches. This makes it well suited to natural language understanding tasks, where the meaning of a word depends on its full surrounding context.
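
To make the pre-train-then-fine-tune workflow concrete, here is a minimal sketch of loading a pre-trained BERT checkpoint and preparing it for sentiment classification. It assumes the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint, neither of which is named in the text; treat it as one possible way to apply the idea, not as the definitive implementation.

```python
# Sketch only: assumes the Hugging Face `transformers` library and the
# publicly available "bert-base-uncased" checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a BERT model pre-trained on unlabeled text. The sequence-classification
# head on top is randomly initialized and would be trained during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # e.g. positive / negative sentiment
)

# Tokenize a toy sentence. In every layer, BERT attends to both the left and
# right context of each token, so the whole sentence informs each representation.
inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")

# Forward pass. Before fine-tuning, these logits are essentially uninformative;
# fine-tuning on labeled sentiment data is what makes them meaningful.
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 2]): one score per sentiment label
```

In practice, the same pre-trained weights are reused across tasks; only the small task-specific head and a few epochs of fine-tuning differ between, say, sentiment analysis and question answering.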