NVIDIA AI Foundry Builds Custom Llama 3.1 Generative AI Models for the World's Enterprises

NVIDIA launched the AI Foundry service and NIM inference microservices to supercharge generative AI for enterprises with the new Llama 3.1 models, enabling them to build custom supermodels.

Jesse Anglen
July 24, 2024


NVIDIA today announced the new NVIDIA AI Foundry service and NVIDIA NIM inference microservices to supercharge generative AI for the world’s enterprises with the Llama 3.1 collection of openly available models, also introduced today.


With NVIDIA AI Foundry, enterprises and nations can now create custom “supermodels” for their domain-specific industry use cases using Llama 3.1 and NVIDIA software, computing, and expertise. Enterprises can train these supermodels with proprietary data as well as synthetic data generated from Llama 3.1 405B and the NVIDIA Nemotron Reward model.
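
To make this concrete, here is a minimal sketch of what such a synthetic-data loop could look like, assuming access to hosted Llama 3.1 405B and Nemotron reward endpoints through an OpenAI-compatible API. The endpoint URL, model identifiers, and reward-score parsing are illustrative assumptions, not documented AI Foundry specifics.

```python
# Hypothetical sketch: generate synthetic training pairs with Llama 3.1 405B,
# then score candidates with a reward model and keep the best one per prompt.
# Endpoint URL, model names, and response parsing are assumptions, not specifics of AI Foundry.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_NVIDIA_API_KEY",
)

def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    """Ask Llama 3.1 405B for several candidate answers to one prompt."""
    candidates = []
    for _ in range(n):
        resp = client.chat.completions.create(
            model="meta/llama-3.1-405b-instruct",   # assumed model identifier
            messages=[{"role": "user", "content": prompt}],
            temperature=0.9,                        # higher temperature for diverse candidates
            max_tokens=512,
        )
        candidates.append(resp.choices[0].message.content)
    return candidates

def score_with_reward_model(prompt: str, answer: str) -> float:
    """Send the (prompt, answer) pair to the Nemotron reward model and return a score.
    The exact request/response format of the reward endpoint is an assumption here."""
    resp = client.chat.completions.create(
        model="nvidia/nemotron-4-340b-reward",      # assumed model identifier
        messages=[
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},
        ],
    )
    # Assumption: the reward is exposed as a parsable "name:value" pair in the response text.
    return float(resp.choices[0].message.content.split(":")[-1])

prompts = ["Summarize our warranty policy for a customer."]
dataset = []
for p in prompts:
    best = max(generate_candidates(p), key=lambda a: score_with_reward_model(p, a))
    dataset.append({"prompt": p, "response": best})   # keep only the top-scored pair
```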


NVIDIA AI Foundry is powered by the NVIDIA DGX Cloud AI platform, which is co-engineered with the world’s leading public clouds, to give enterprises significant compute resources that easily scale as AI demands change. The new offerings come at a time when enterprises, as well as nations developing sovereign AI strategies, want to build custom large language models with domain-specific knowledge for generative AI applications that reflect their unique business or culture.


“Meta’s openly available Llama 3.1 models mark a pivotal moment for the adoption of generative AI within the world’s enterprises,” said Jensen Huang, founder and CEO of NVIDIA. “Llama 3.1 opens the floodgates for every enterprise and industry to build state-of-the-art generative AI applications. NVIDIA AI Foundry has integrated Llama 3.1 throughout and is ready to help enterprises build and deploy custom Llama supermodels.”


“The new Llama 3.1 models are a super-important step for open source AI,” said Mark Zuckerberg, founder and CEO of Meta. “With NVIDIA AI Foundry, companies can easily create and customize the state-of-the-art AI services people want and deploy them with NVIDIA NIM. I’m excited to get this in people’s hands.”


To supercharge enterprise deployments of Llama 3.1 models for production AI, NVIDIA NIM inference microservices for Llama 3.1 models are now available for download from ai.nvidia.com. NIM microservices are the fastest way to deploy Llama 3.1 models in production and power up to 2.5x higher throughput than running inference without NIM.
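
For a sense of what a deployment looks like in practice: NIM microservices expose an OpenAI-compatible HTTP API, so a running Llama 3.1 NIM can be queried with a standard client. The host, port, and model identifier in this sketch are assumptions for a hypothetical local deployment.

```python
# Hypothetical sketch: query a locally deployed Llama 3.1 NIM microservice.
# NIM exposes an OpenAI-compatible API; the address and model name below are
# illustrative assumptions, not confirmed defaults.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed address of the NIM container
    api_key="not-used-for-local-nim",     # a local NIM typically does not need a key
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # assumed model identifier served by the NIM
    messages=[
        {"role": "system", "content": "You are a concise enterprise support assistant."},
        {"role": "user", "content": "Explain our data-retention policy in two sentences."},
    ],
    max_tokens=128,
)

print(response.choices[0].message.content)
```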


Enterprises can pair Llama 3.1 NIM microservices with new NVIDIA NeMo Retriever NIM microservices to create state-of-the-art retrieval pipelines for AI copilots, assistants, and digital human avatars.
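
A minimal sketch of such a pairing, assuming a NeMo Retriever embedding microservice and a Llama 3.1 NIM are both running locally behind OpenAI-compatible endpoints; the addresses, model identifiers, and the extra input_type parameter are illustrative assumptions.

```python
# Hypothetical sketch of a minimal retrieval pipeline: an embedding microservice
# ranks documents for a query, and a Llama 3.1 NIM answers using the top match
# as context. Endpoints and model names are illustrative assumptions.
import numpy as np
from openai import OpenAI

embedder = OpenAI(base_url="http://localhost:8001/v1", api_key="none")  # assumed retriever NIM
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="none")       # assumed Llama 3.1 NIM

documents = [
    "Refunds are issued within 14 days of purchase with a valid receipt.",
    "Enterprise support tickets are answered within four business hours.",
]

def embed(texts: list[str], input_type: str) -> np.ndarray:
    resp = embedder.embeddings.create(
        model="nvidia/nv-embedqa-e5-v5",            # assumed embedding model identifier
        input=texts,
        extra_body={"input_type": input_type},      # assumed: retrievers often distinguish query vs. passage
    )
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents, "passage")

query = "How fast do you respond to support tickets?"
query_vector = embed([query], "query")[0]

# Rank documents by cosine similarity and keep the best match as context.
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
context = documents[int(np.argmax(scores))]

answer = llm.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",             # assumed model identifier
    messages=[
        {"role": "system", "content": f"Answer using this context: {context}"},
        {"role": "user", "content": query},
    ],
)
print(answer.choices[0].message.content)
```

A production retrieval pipeline would store embeddings in a vector database rather than comparing them in memory; the cosine-similarity step here is only meant to show where the retriever output feeds the generator.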


Accenture Pioneers Custom Llama Supermodels for Enterprises With AI Foundry


Accenture is the first to leverage NVIDIA AI Foundry to create custom Llama 3.1 models for its clients. Companies like Aramco, AT&T, and Uber are among the early adopters of the new Llama NVIDIA NIM microservices, indicating a strong interest across various industries.


Accenture, utilizing its AI Refinery™ framework, is pioneering the use of NVIDIA AI Foundry to develop custom Llama 3.1 models. “The world’s leading enterprises see how generative AI is transforming every industry and are eager to deploy applications powered by custom models,” said Julie Sweet, chair and CEO of Accenture. “Accenture has been working with NVIDIA NIM inference microservices for our internal AI applications, and now, using NVIDIA AI Foundry, we can help clients quickly create and deploy custom Llama 3.1 models to power transformative AI applications for their own business priorities.”


NVIDIA AI Foundry offers an end-to-end service that includes model curation, synthetic data generation, fine-tuning, retrieval, and evaluation. Enterprises can use Llama 3.1 models and the NVIDIA NeMo platform to create domain-specific models, with the option to generate synthetic data to enhance model accuracy.
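
As a rough stand-in for the fine-tuning step, the sketch below adapts a Llama 3.1 checkpoint with LoRA via Hugging Face PEFT rather than the NeMo toolchain AI Foundry actually provides; the checkpoint name and training pair are placeholders.

```python
# Illustrative stand-in for the fine-tuning step: a small LoRA adaptation of a
# Llama 3.1 checkpoint using Hugging Face PEFT, not the NeMo platform itself.
# The (gated) checkpoint name and the training pair are placeholders.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "meta-llama/Llama-3.1-8B-Instruct"   # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")

# Wrap the base model with low-rank adapters so only a small set of weights trains.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Tiny placeholder dataset standing in for curated proprietary and synthetic pairs.
pairs = [{"text": "Q: What is our SLA?\nA: Four business hours for priority tickets."}]
dataset = Dataset.from_list(pairs).map(
    lambda row: tokenizer(row["text"], truncation=True, max_length=512)
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama31-custom", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

model.save_pretrained("llama31-custom-adapter")   # adapter weights for later deployment
```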


For more insights on generative AI and its applications, visit Rapid Innovation Blogs.

