Amazon Unveils AI-Language Model That Outperforms GPT-3 And PaLM

September 9, 2022

AI researchers from Amazon have published a new AI model that could improve its voice assistant Alexa.

As outlined in a paper, the Alexa Teacher Model 20B (AlexaTM 20B) sequence-to-sequence model boasts 20 billion parameters. It supports multiple languages, including Arabic, Hindi, Japanese, Tamil and Spanish.

Unlike OpenAI’s GPT-3, which uses a decoder-only approach, AlexaTM 20B uses an encoder-decoder architecture. Compared with rival models, this allows it to improve effectiveness in tasks such as text summarization and machine translation.

Regarding capabilities, Amazon’s researchers suggest it outperforms GPT-3 when it comes to linguistic tasks. The model is also capable of few-shot or low-shot learning — where an AI model uses less training data to reduce ML costs. Depending on the input, AlexaTM 20B can generalise the task to other languages familiar to it.

“At Alexa AI, we are moving to the new paradigm of generalisable intelligence, in which models can learn new concepts and transfer knowledge from one language or task to another with minimal human input,” commented Saleh Soltan, a Senior Applied Scientist with Alexa AI. “Such models allow us to efficiently develop new features and improve Alexa on multiple languages simultaneously.”

Amazon’s AI team plans to further evaluate the model by benchmarking it with different public datasets such as MultiATIS, mTOP and MASSIVE.

The researchers also want to make greater use of dialogue and user context, experiment with code-switching and examine varying levels of automated speech recognition noise.

“Overall, our results present a compelling case for seq2seq (sequence-to-sequence) models as a powerful alternative to decoder-only models for large-scale language model training,” according to the paper.

Amazon Unveils AI-Language Model That Outperforms GPT-3 And PaLM

AI researchers from Amazon have published a new AI model that could improve its voice assistant Alexa.

Latest Posts

OpenAI’s o3-Pro Is Here; Open-Weights Model Delayed

Mistral AI Unveils Its First Reasoning Model

Meta’s Zuckerberg Hiring for New ‘Superintelligence’ AI Team: Report

Apple Says AI Models Collapse When Facing Hard Puzzles

Meta in Talks to Invest in Scale AI

Reddit Sues Anthropic Over Alleged Data Scraping for AI Training