Google GShard

In this paper, Google presents its work on scaling a giant language translation model with 600 billion parameters, trained on 2048 TPU v3 cores. To cope with the challenges of training such large-scale models, including computation cost, ease of programming, and efficient implementation on parallel devices, the authors developed GShard. GShard is a module composed of lightweight annotation APIs and an extension to the XLA compiler that makes it possible to express a wide range of parallel computation patterns with minimal changes to existing model code. Using GShard's automatic sharding, the authors trained a multilingual neural machine translation Transformer model with Sparsely-Gated Mixture-of-Experts layers beyond 600 billion parameters in just 4 days on 2048 TPU v3 accelerators, achieving far superior quality for translation from 100 languages into English compared to previous methods.
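
GShard itself is built as annotation APIs plus an XLA compiler extension on top of Google's TensorFlow stack, so the snippet below is not the paper's API. It is a minimal sketch of the same annotation idea using JAX, whose sharding machinery is also backed by XLA SPMD partitioning: the programmer annotates how a tensor is laid out across devices, and the compiler partitions the computation automatically. The mesh axis name, tensor shapes, and the toy function are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D mesh over whatever accelerators are available
# (the paper uses 2048 TPU v3 cores; this sketch runs on any host).
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

# A toy batch of token activations: (tokens, model_dim).
# Assumes the token count is divisible by the device count.
x = jnp.ones((8, 16))

# Annotation: split the token dimension across devices and
# replicate the model dimension, analogous in spirit to the
# split/replicate annotations described in the paper.
x = jax.device_put(x, NamedSharding(mesh, P("data", None)))

# The jit-compiled computation is then partitioned automatically
# by the compiler (XLA SPMD), with no change to the model code.
@jax.jit
def scale(v):
    return v * 2.0

y = scale(x)
print(y.sharding)  # shows the sharding propagated to the output
```

The point of the annotation style, in GShard as in this sketch, is that parallelism is expressed as a property of the data layout rather than rewritten into the model: the same model code runs on one device or thousands.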
