Google Cloud, Google’s cloud computing services platform, announced a multi-year collaboration with startup Cohere to “accelerate natural language processing (NLP) to businesses by making it more cost-effective.”
Under the partnership, Google Cloud says it’ll help Cohere establish computing infrastructure to power Cohere’s API, enabling Cohere to train large language models on dedicated hardware.
The news comes a day after Cohere announced the general availability of its API, which lets customers access models that are fine-tuned for a range of natural language applications — in some cases at a fraction of the cost of rival offerings. “Leading companies around the world are using AI to fundamentally transform their business processes and deliver more helpful customer experiences,” Google Cloud CEO Thomas Kurian said in a statement. “Our work with Cohere will make it easier and more cost-effective for any organisation to realise the possibilities of AI with powerful NLP services powered by Google’s custom-designed [hardware].”
Unlike some of its competitors, Cohere offers two types of English NLP models, generation and representation, in large, medium, and small sizes. The generation models handle tasks that involve producing text, such as writing product descriptions or extracting document metadata. By contrast, the representation models are about understanding language, driving apps like semantic search, chatbots, and sentiment analysis.
To keep its technology relatively affordable, Cohere charges for access on a per-character basis, with the price depending on the size of the model and the number of characters an app uses. Only the generation models charge on both input and output characters, while the other models charge on output characters alone. Meanwhile, all fine-tuned models (models tailored to particular domains, industries, or scenarios) are priced at twice the baseline model rate.
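The billing rules described above can be sketched in a few lines of Python. This is an illustration only, not Cohere's actual billing code: the `Usage` class, the function names, and the `rate_per_char` parameter are hypothetical, and no real prices are assumed. The per-character rate is passed in by the caller because, per the article, it varies with model size.

```python
from dataclasses import dataclass

# From the article: fine-tuned models cost twice the baseline rate.
FINE_TUNED_MULTIPLIER = 2


@dataclass(frozen=True)
class Usage:
    """Hypothetical record of one billing period for one model."""
    model_type: str          # "generation" or "representation"
    input_chars: int
    output_chars: int
    fine_tuned: bool = False


def billable_characters(usage: Usage) -> int:
    """Generation models bill input plus output; other models bill output only."""
    if usage.model_type == "generation":
        return usage.input_chars + usage.output_chars
    return usage.output_chars


def estimate_cost(usage: Usage, rate_per_char: float) -> float:
    """Cost = billable characters x baseline rate x fine-tuning multiplier.

    rate_per_char is a placeholder for the size-dependent baseline rate.
    """
    multiplier = FINE_TUNED_MULTIPLIER if usage.fine_tuned else 1
    return billable_characters(usage) * rate_per_char * multiplier
```

Under these assumptions, a generation request with 1,000 input characters and 500 output characters bills 1,500 characters, while the same traffic on a representation model bills only 500. Setting `fine_tuned=True` doubles the resulting cost in either case.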
Large language models
The partnership with Google Cloud will grant Cohere access to dedicated fourth-generation tensor processing units (TPUs) running in Google Cloud instances. TPUs are custom chips explicitly developed to accelerate AI training, powering products like Google Search, Google Photos, Google Translate, Google Assistant, Gmail, and Google Cloud AI APIs.
“The partnership will run until the end of 2024 with options to extend into 2025 and 2026. Google Cloud and Cohere have plans to partner on a go-to-market strategy,” said Cohere CEO Aidan Gomez. “We met with several Cloud providers and felt that Google Cloud was best positioned to meet our needs.”
Cohere’s decision to partner with Google Cloud reflects the logistical challenges of developing large language models. For example, Nvidia’s recently released Megatron 530B model was trained across 560 Nvidia DGX A100 servers, each hosting 8 Nvidia A100 80GB GPUs. Microsoft and Nvidia say they observed between 113 and 126 teraflops per GPU while training Megatron 530B, which would put the training cost in the millions of dollars.
Inference, meaning actually running the trained model, is another challenge. On two of its costly DGX SuperPod systems, Nvidia claims that inference (for example, autocompleting a sentence) with Megatron 530B takes only half a second. But it can take over a minute on a CPU-based on-premises server. While cloud alternatives might be cheaper, they’re not dramatically so: one estimate pegs the cost of running GPT-3 on a single Amazon Web Services instance at a minimum of $87,000 per year.
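To give a sense of the scale those figures imply, the short Python calculation below multiplies them out. It is a back-of-the-envelope sketch that assumes every GPU sustains the reported per-GPU rate at the same time, which real training runs may not achieve; the variable names are illustrative.

```python
# Figures reported in the article for Megatron 530B training.
SERVERS = 560
GPUS_PER_SERVER = 8
TFLOPS_PER_GPU_LOW = 113
TFLOPS_PER_GPU_HIGH = 126

# Total number of A100 GPUs in the cluster.
total_gpus = SERVERS * GPUS_PER_SERVER  # 4480

# Aggregate throughput in petaflops (1 petaflop = 1,000 teraflops),
# assuming all GPUs run at the observed per-GPU rate simultaneously.
low_pflops = total_gpus * TFLOPS_PER_GPU_LOW / 1000
high_pflops = total_gpus * TFLOPS_PER_GPU_HIGH / 1000

print(f"{total_gpus} GPUs, roughly {low_pflops:.0f} to {high_pflops:.0f} petaflops in aggregate")
```

That works out to 4,480 GPUs and an aggregate in the neighborhood of 500 petaflops, which helps explain why few startups attempt to build such infrastructure on their own.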
Cohere rival OpenAI trains its large language models on an “AI supercomputer” hosted by Microsoft, which invested over $1 billion in the company in 2019, roughly $500 million of which came in the form of Azure compute credits.
In Cohere, Google Cloud, which already offered a range of NLP services, gains a customer in a market that has been growing during the pandemic. According to a 2021 survey from John Snow Labs and Gradient Flow, 60 per cent of tech leaders indicated that their NLP budgets grew by at least 10 per cent compared to 2020, while a third (33 per cent) said their spending climbed by more than 30 per cent.
“We’re dedicated to supporting companies, such as Cohere, through our advanced infrastructure offering to drive innovation in NLP,” Google Cloud AI director of product management Craig Wiley told VentureBeat via email. “Our goal is always to provide the best pipeline tools for developers of NLP models. By bringing together the NLP expertise from both Cohere and Google Cloud, we are going to be able to provide customers with some pretty extraordinary outcomes.”
The global NLP market is projected to be worth $2.53 billion by 2027, up from $703 million in 2020. And if the current trend holds, a substantial portion of that spending will go toward cloud infrastructure, benefiting Google Cloud.