NVIDIA AI Foundry Builds Custom Llama 3.1 GenAI Models for the World’s Enterprises

Enterprises and nations can now build ‘Supermodels’ with NVIDIA AI Foundry using their own data paired with Llama 3.1 405B and NVIDIA Nemotron models.

NVIDIA announced a new NVIDIA AI Foundry service and NVIDIA NIM™ inference microservices to supercharge generative AI for the world’s enterprises with the Llama 3.1 collection of openly available models.

With NVIDIA AI Foundry, enterprises and nations can now create custom “supermodels” for their domain-specific industry use cases using Llama 3.1 and NVIDIA software, computing and expertise. Enterprises can train these supermodels with proprietary data as well as synthetic data generated from Llama 3.1 405B and the NVIDIA Nemotron™ Reward model.

NVIDIA AI Foundry is powered by the NVIDIA DGX™ Cloud AI platform, co-engineered with the world's leading public clouds to give enterprises significant compute resources that scale easily as AI demands change.

The new offerings come at a time when enterprises, as well as nations developing sovereign AI strategies, want to build custom large language models with domain-specific knowledge for generative AI applications that reflect their unique business or culture.

“Meta’s openly available Llama 3.1 models mark a pivotal moment for the adoption of generative AI within the world’s enterprises,” said Jensen Huang, Founder and CEO of NVIDIA. “Llama 3.1 opens the floodgates for every enterprise and industry to build state-of-the-art generative AI applications. NVIDIA AI Foundry has integrated Llama 3.1 throughout and is ready to help enterprises build and deploy custom Llama supermodels.”

“The new Llama 3.1 models are a super-important step for open source AI,” said Mark Zuckerberg, Founder and CEO of Meta. “With NVIDIA AI Foundry, companies can easily create and customize the state-of-the-art AI services people want and deploy them with NVIDIA NIM. I’m excited to get this in people’s hands.”

To supercharge enterprise deployments of Llama 3.1 models for production AI, NVIDIA NIM inference microservices for Llama 3.1 models are now available for download from ai.nvidia.com. NIM microservices are the fastest way to deploy Llama 3.1 models in production and deliver up to 2.5x higher throughput than running inference without NIM.
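NIM microservices expose an OpenAI-compatible HTTP API, so a deployed Llama 3.1 NIM can be queried with a standard chat-completions payload. The sketch below illustrates this with only the Python standard library; the endpoint URL, model name, and environment-variable name are illustrative assumptions, not values from this announcement.

```python
import json
import os
import urllib.request

# Assumed endpoint for a locally deployed NIM; hosted endpoints differ.
NIM_URL = "http://localhost:8000/v1/chat/completions"
# Assumed model identifier for the Llama 3.1 405B instruct NIM.
MODEL = "meta/llama-3.1-405b-instruct"


def build_chat_request(prompt: str, model: str = MODEL,
                       max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }


def chat(prompt: str) -> str:
    """POST the payload to the NIM microservice and return the reply text."""
    request = urllib.request.Request(
        NIM_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # An API key is typically needed only for hosted endpoints.
            "Authorization": f"Bearer {os.environ.get('NVIDIA_API_KEY', '')}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Summarize Llama 3.1 in one sentence."))
```

Because the request shape follows the OpenAI chat-completions convention, the same client code works whether the NIM runs on-premises or in a hosted cloud environment.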

Enterprises can pair Llama 3.1 NIM microservices with new NVIDIA NeMo Retriever NIM microservices to create state-of-the-art retrieval pipelines for AI copilots, assistants and digital human avatars.
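A retrieval pipeline of this kind embeds documents and queries, ranks passages by similarity, and feeds the top matches to the language model as context. The minimal sketch below shows only that ranking and prompt-assembly logic; in a real pipeline the embeddings would come from a NeMo Retriever embedding NIM and the answer from a Llama 3.1 NIM, whereas here the vectors are hard-coded so the code is self-contained and runnable.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]],
          k: int = 2) -> list[str]:
    """Return the ids of the k document embeddings closest to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Assemble retrieved passages into a grounded prompt for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The retrieved passages are prepended to the user's question, so the model's answer stays grounded in enterprise data rather than the model's parametric knowledge.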