

Cost remains a primary concern for enterprise AI adoption, and it’s a challenge that AWS is tackling head-on.

At the AWS re:Invent 2024 conference today, the cloud giant announced HyperPod Task Governance, a sophisticated solution targeting one of the most expensive inefficiencies in enterprise AI operations: underutilized GPU resources.

According to AWS, HyperPod Task Governance can increase AI accelerator utilization, helping enterprises optimize AI costs and producing potentially significant savings.

“This innovation helps you maximize compute resource utilization by automating the prioritization and management of these gen AI tasks, reducing the cost by up to 40%,” said Swami Sivasubramanian, VP of AI and Data at AWS.

End GPU idle time

As organizations rapidly scale their AI initiatives, many are discovering a costly paradox. Despite heavy investments in GPU infrastructure to power various AI workloads, including training, fine-tuning and inference, these expensive computing resources frequently sit idle.

Enterprise leaders report surprisingly low utilization rates across their AI projects, even as teams compete for computing resources. As it turns out, it’s a challenge that AWS itself faced.

“Internally, we had this kind of problem as we were scaling up more than a year ago, and we built a system that takes into account the consumption needs of these accelerators,” Sivasubramanian told NeuralNation. “I talked to many of our customers, CIOs and CEOs, they said we want exactly that; we want it as part of SageMaker, and that’s what we are launching.”

Sivasubramanian said that once the system was deployed, AWS’ AI accelerator utilization went through the roof, with utilization rates rising above 90%.

How HyperPod Task Governance works

The SageMaker HyperPod technology was first announced at the re:Invent 2023 conference.

SageMaker HyperPod is built to handle the complexity of training large models with billions or tens of billions of parameters, which requires managing large clusters of machine learning accelerators.

HyperPod Task Governance adds a new layer of control to SageMaker HyperPod by introducing intelligent resource allocation across different AI workloads.

The system recognizes that different AI tasks have varying demand patterns throughout the day. For instance, inference workloads typically peak during business hours when applications see the most use, while training and experimentation can be scheduled during off-peak hours.

The system provides enterprises with real-time insights into project utilization, team resource consumption, and compute needs. It enables organizations to effectively load balance their GPU resources across different teams and projects, ensuring that expensive AI infrastructure never sits idle.
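To make the scheduling idea concrete, here is a minimal sketch of time-aware, priority-based GPU allocation. This is a hypothetical illustration of the general technique, not the actual HyperPod Task Governance API; the task names, priority rules, and business-hours window are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    kind: str   # "inference" or "training" (assumed categories)
    gpus: int   # accelerators requested

def priority(task: Task, hour: int) -> int:
    """Higher value is scheduled first: inference wins during business
    hours (assumed 9:00-18:00), training wins overnight."""
    business_hours = 9 <= hour < 18
    if task.kind == "inference":
        return 2 if business_hours else 1
    return 1 if business_hours else 2

def allocate(tasks: list[Task], capacity: int, hour: int) -> list[str]:
    """Greedily grant GPUs to the highest-priority tasks that fit."""
    granted = []
    for task in sorted(tasks, key=lambda t: priority(t, hour), reverse=True):
        if task.gpus <= capacity:
            capacity -= task.gpus
            granted.append(task.name)
    return granted

tasks = [
    Task("chatbot-serving", "inference", 600),
    Task("llm-finetune", "training", 500),
]
print(allocate(tasks, capacity=1000, hour=14))  # daytime: inference first
print(allocate(tasks, capacity=1000, hour=2))   # overnight: training first
```

Run at 2 p.m., the scheduler grants the inference job first and the training job waits; run at 2 a.m., the order flips, so the same 1,000 accelerators absorb both workloads across the day instead of idling overnight.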

AWS wants to make sure enterprises don’t leave money on the table

Sivasubramanian highlighted the critical importance of AI cost management during his keynote address.

As an example, he said that if an organization has a thousand AI accelerators deployed, not all of them are utilized consistently over a 24-hour period. During the day, they are heavily used for inference, but at night, when inference demand may be very low, a large portion of these costly resources sits idle.
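The idle-capacity math behind that example can be sketched in a few lines. The hourly rate and utilization figures below are assumptions chosen purely for illustration, not AWS pricing or reported numbers.

```python
# Back-of-envelope cost of idle accelerators over a 24-hour cycle.
# All figures are illustrative assumptions, not actual AWS pricing.
accelerators = 1000
hourly_cost = 30.0          # assumed $ per accelerator-hour
day_utilization = 0.90      # busy daytime inference (12 hours)
night_utilization = 0.25    # mostly idle overnight (12 hours)

idle_hours = accelerators * (
    12 * (1 - day_utilization) + 12 * (1 - night_utilization)
)
daily_idle_cost = idle_hours * hourly_cost
print(f"Idle accelerator-hours per day: {idle_hours:,.0f}")
print(f"Daily cost of idle capacity: ${daily_idle_cost:,.0f}")
```

Under these assumed numbers, roughly 10,200 accelerator-hours go unused each day; scheduling training and experimentation into that overnight trough is where the claimed savings come from.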

“We live in a world where compute resources are finite and expensive, and it can be difficult to maximize utilization and efficiently allocate resources, which is typically done through spreadsheets and calendars,” he said. “Now, without a strategic approach to resource allocation, you’re not only missing opportunities, but you’re also leaving money on the table.”
