Rapt AI and AMD work to make GPU utilization more efficient

Rapt AI, a provider of AI-powered workload automation for GPUs and AI accelerators, has teamed up with AMD to enhance AI infrastructure.

The long-term strategic collaboration aims to improve AI inference and training workload management and performance on AMD Instinct GPUs, offering customers a scalable and cost-effective solution for deploying AI applications.

As AI adoption accelerates, organizations are grappling with resource allocation, performance bottlenecks, and complex GPU management.

By integrating Rapt’s intelligent workload automation platform with AMD Instinct MI300X, MI325X and upcoming MI350 series GPUs, this collaboration delivers a scalable, high-performance, and cost-effective solution that enables customers to maximize AI inference and training efficiency across on-premises and multi-cloud infrastructures.

A more efficient solution

AMD Instinct MI325X GPU.

Charlie Leeming, CEO of Rapt AI, said in a press briefing, “The AI models we are seeing today are so large and most importantly are so dynamic and unpredictable. The older tools for optimizing don’t really fit at all. We observed these dynamics. Enterprises are throwing lots of money. Hiring a new set of talent in AI. It’s one of these disruptive technologies. We have a scenario where CFOs and CIOs are asking where is the return. In some cases, there is tens of millions, hundreds of millions or billions of dollars spent on GPU-related infrastructure.”

Leeming said Anil Ravindranath, CTO of Rapt AI, saw the solution, which involved deploying monitors to observe the infrastructure.

“We feel we have the right solution at the right time. We came out of stealth last fall. We are in a growing number of Fortune 100 companies. Two are running the code among cloud service providers,” Leeming said.

And he said, “We do have strategic partners but our conversations with AMD went extremely well. They are building tremendous GPUs, AI accelerators. We are known for putting the maximum amount of workload on GPUs. Inference is taking off. It’s in production stage now. AI workloads are exploding. Their data scientists are running as fast as they can. They are panicking, they need tools, they need efficiency, they need automation. It’s screaming for the right solution. Inefficiencies — 30% GPU underutilization. Customers do want flexibility. Large customers are asking if you support AMD.”

Improvements that once took nine hours can now be done in three minutes, he said. In a press briefing, Ravindranath said the Rapt AI platform delivers up to 10 times the model-run capacity at the same level of AI compute spending and up to 90% cost savings, with no humans in the loop and no code changes. For productivity, that means no more waiting for compute or time spent tuning infrastructure.

Leeming said other techniques have been around for a while and haven’t cut it. Run AI, a rival, overlaps somewhat as a competitor. He said his company observes in minutes instead of hours and then optimizes the infrastructure. Ravindranath said Run AI is more of a scheduler, while Rapt AI positions itself to handle unpredictable workloads.

“We run the model and figure it out, and that’s a huge benefit for inference workloads. It should just automatically run,” Ravindranath said.

The benefits: lower costs, better GPU utilization

AMD Instinct MI300X GPU.

The companies said that AMD Instinct GPUs, with their industry-leading memory capacity, combined with Rapt’s intelligent resource optimization, help ensure maximum GPU utilization for AI workloads, helping lower total cost of ownership (TCO).

Rapt’s platform streamlines GPU management, eliminating the need for data scientists to spend valuable time on trial-and-error infrastructure configurations. By automatically optimizing resource allocation for their specific workloads, the platform lets data scientists focus on innovation rather than infrastructure. It supports diverse GPU environments (AMD and others, whether in the cloud, on premises or both) through a single instance, helping ensure maximum infrastructure flexibility.

The combined solution intelligently optimizes job density and resource allocation on AMD Instinct GPUs, resulting in better inference performance and scalability for production AI deployments. Rapt’s auto-scaling capabilities further help ensure efficient resource use based on demand, reducing latency and maximizing cost efficiency.

Rapt’s platform works out of the box with AMD Instinct GPUs, helping ensure immediate performance benefits. Ongoing collaboration between Rapt and AMD will drive further optimizations in areas such as GPU scheduling and memory utilization, helping ensure customers are equipped with future-ready AI infrastructure.

“At AMD, we are committed to delivering high-performance, scalable AI solutions that empower organizations to unlock the full potential of their AI workloads,” said Negin Oliver, corporate vice president of business development for the data center GPU business at AMD, in a statement. “Our collaboration with Rapt AI combines the cutting-edge capabilities of AMD Instinct GPUs with Rapt’s intelligent workload automation, enabling customers to achieve greater efficiency, flexibility, and cost savings across their AI infrastructure.”


