At Google I/O 2021 on Tuesday, the tech giant made a number of announcements, including a new AI that converses like a real person and custom AI chips that CEO Sundar Pichai called a “historic milestone” for the company.
AI conversations get more conversational with LaMDA
On Tuesday, Google unveiled its latest language breakthrough: a conversational language model called LaMDA (Language Model for Dialogue Applications). The new natural language processing technique makes conversations with AI models more resilient to unusual or unexpected queries, so that talking to one feels more like chatting with a real person and less like using a voice interface to a search function.
Like other recently developed language models, including BERT and GPT-3, LaMDA is built on Transformer, the neural network architecture that Google Research invented and open-sourced in 2017.
However, unlike other language models, Google’s LaMDA was trained on dialogue, teaching it how to engage in free-flowing conversations. This training taught LaMDA to deliver responses that not only make sense in context but are also specific to it.
Google gave an example with the prompt, “I just started taking guitar lessons.” A sensible and specific response might be, “How exciting! My mom has a vintage Martin that she loves to play.”
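LaMDA itself is not publicly available, but the basic interaction pattern it relies on (feed the dialogue so far into a Transformer-based model and sample a free-form reply) can be sketched in Python using the open DialoGPT model as a stand-in. The model choice and sampling settings below are illustrative, not anything Google has published:

```python
# Illustrative stand-in for a dialogue-tuned Transformer; LaMDA is not public.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode the user's turn; DialoGPT uses the end-of-sequence token to mark turns.
prompt = "I just started taking guitar lessons."
input_ids = tokenizer.encode(prompt + tokenizer.eos_token, return_tensors="pt")

# Sample a free-flowing continuation of the conversation.
reply_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, i.e. the model's reply.
reply = tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```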
Google is also exploring how to add dimensions to responses, such as “interestingness,” which could include insightful, unexpected, or witty responses. It’s also working on ensuring that responses are factually correct and meet Google’s AI principles.
Vertex AI collects machine learning development tools in one place
Google announced the general availability of Vertex AI, a managed platform designed to help data scientists and ML engineers build, deploy and manage ML projects.
While Google already has a bevy of machine learning products and services that compete with platforms such as AWS’s SageMaker, it contends that the tools on the market are often incomplete.
Vertex AI aims to enable highly scalable workflows, along with access to MLOps tools for maintaining and managing models in production. It also promises to shorten the time it takes to build and train models. The platform brings together Google Cloud’s services for building ML under one unified UI and API, and working in a single environment should make it easier to move models from experimentation into production, discover trends and make predictions.
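As a rough sketch of that unified workflow, here is how creating a dataset, training an AutoML model and deploying it might look with the Vertex AI Python SDK (google-cloud-aiplatform). The project ID, bucket path, column names and machine type are placeholders:

```python
# Minimal sketch of a Vertex AI workflow: dataset -> AutoML training -> endpoint.
# All identifiers below are placeholders, not a real project.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Register training data that already lives in Cloud Storage as a managed dataset.
dataset = aiplatform.TabularDataset.create(
    display_name="churn-data",
    gcs_source="gs://my-bucket/churn.csv",
)

# Train a classification model with AutoML, with no custom training code.
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="churn-automl",
    optimization_prediction_type="classification",
)
model = job.run(dataset=dataset, target_column="churned")

# Deploy the trained model behind a managed prediction endpoint and query it.
endpoint = model.deploy(machine_type="n1-standard-4")
print(endpoint.predict(instances=[{"tenure": "12", "plan": "basic"}]))
```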
Vertex AI gives teams access to the AI tools Google uses internally for computer vision, language, conversation and structured data. The toolkit will be regularly improved by Google Research.
It also includes new MLOps features like Vertex Vizier, an optimisation service: customers give Vertex Vizier a set of variables and the function or metric they’re trying to optimise, and the service searches for the parameter values that tune the model best.
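Vizier’s own API isn’t shown here; the sketch below only illustrates the black-box tuning loop such a service automates, where the user supplies parameter ranges and a metric and the service hunts for the best settings (Vizier uses far smarter search strategies than the random sampling used for brevity here). The objective function and parameters are invented for the example:

```python
# Conceptual illustration of black-box tuning, the problem Vertex Vizier automates.
# The objective and parameter ranges are made up for the example.
import random

def train_and_evaluate(learning_rate, batch_size):
    """Stand-in for training a model and returning a validation metric."""
    return -(learning_rate - 0.01) ** 2 - (batch_size - 64) ** 2 / 1e4

search_space = {
    "learning_rate": lambda: 10 ** random.uniform(-4, -1),   # log-uniform sample
    "batch_size": lambda: random.choice([16, 32, 64, 128]),
}

best_score, best_params = float("-inf"), None
for _ in range(50):
    params = {name: sample() for name, sample in search_space.items()}
    score = train_and_evaluate(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```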
The fully managed Vertex Feature Store helps users share and reuse ML features, and connecting those features to tools like ML pipelines lets teams set up repeatable workflows.
The platform’s MLOps tools, including Vertex Continuous Monitoring and Vertex Pipelines, are designed to eliminate the do-it-yourself maintenance often required to keep models running in production.
The platform also caters to a broad range of skill levels, from business analysts using AutoML capabilities to data scientists building sophisticated custom models.
There’s a new generation of Google’s custom AI chips
Google says that it has designed a new AI chip that’s more than twice as fast as its previous version. TPU V4 (TPU stands for Tensor Processing Unit) reaches an entirely new height in computing performance for AI software running in Google data centres. A single TPU V4 pod, a cluster of interconnected servers combining about 500 of these processors, is capable of 1 exaFLOP of performance, Google CEO Sundar Pichai said in his live-streamed Google I/O keynote on Tuesday morning.
That’s almost twice the peak performance of Fugaku, the Japanese system at the top of the latest Top500 list of the world’s fastest supercomputers. Fugaku’s peak performance is about 540,000 teraFLOPS.
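The arithmetic behind “almost twice” is straightforward, since an exaFLOP is a million teraFLOPS:

```python
# Sanity check on the figures quoted above.
tpu_v4_pod_flops = 1e18             # 1 exaFLOP, as claimed for a TPU V4 pod
fugaku_peak_flops = 540_000 * 1e12  # roughly 540,000 teraFLOPS for Fugaku

print(tpu_v4_pod_flops / fugaku_peak_flops)  # ~1.85, i.e. "almost twice"
```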
Google has said in the past, however, that its TPU pods operate at lower numerical precision than traditional supercomputers do, making it easier to reach such speeds. Deep-learning models for things like speech or image recognition don’t require calculations nearly as precise as traditional supercomputer workloads, which are used for tasks like simulating the behaviour of human organs or calculating space-shuttle trajectories.
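To see what that precision gap means, compare a 16-bit float with the 64-bit doubles traditional simulation workloads rely on. (TPUs actually use the related bfloat16 format, which plain NumPy doesn’t provide, so standard float16 stands in here to show the idea.)

```python
# Lower precision keeps far fewer significant digits, but deep-learning
# models tolerate that rounding; classic simulation workloads often cannot.
import numpy as np

x = 3.141592653589793
print(float(np.float64(x)))  # 3.141592653589793 (about 15-16 significant digits)
print(float(np.float16(x)))  # 3.140625          (about 3-4 significant digits)
```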
“This is the fastest system we’ve ever deployed at Google and a historic milestone for us,” Pichai said.
A big part of what makes a TPU pod so fast is the interconnect technology that turns hundreds of individual processors into a single system. The TPU pod features “10x interconnect bandwidth per chip at scale than any other networking technology,” said Pichai.
TPU V4 pods will be deployed at Google data centres “soon,” Pichai added. Google announced its first custom AI chip, designed in-house, in 2016. TPU V4 instances will be available to Google Cloud customers later this year.