Meta’s compute grab continues with agreement to deploy tens of millions of AWS Graviton cores

Controlling the system, not just scale

Graphics processing units (GPUs) are essential for large language model (LLM) training, but agentic AI requires a whole new workload capability. CPUs like Graviton5 are rising to this challenge, supporting intensive workloads like real-time reasoning, multi-step tasks, frontier model training, code generation, and deep research.

AWS says Graviton5 has the ability to handle “billions of interactions” and to coordinate complex, multi-stage agentic tasks. It is built on the AWS Nitro System to support high performance, availability, and security.

“This is really about control of the AI system, not just scale,” said Kimball. As AI evolves toward persistent, agentic workloads, the role of the CPU becomes “quite meaningful;” it serves as the control plane, handling orchestration, managing memory, scheduling, and other intensive tasks across accelerators.

“This is especially true in agentic environments, where the workloads will be less linear and more stateful,” he pointed out. So, ensuring a supply of these resources just makes sense.

Reflecting Meta’s diversified approach to hardware

The agreement builds on Meta’s long-standing partnership with AWS, but also reflects what the company calls its “diversified approach” to infrastructure. “No single chip architecture can efficiently serve every workload,” the company emphasized.

Proving the point, Meta recently announced four new generations of its MTIA training and inference accelerator chip and signed a massive deal with AMD to tap into 6GW worth of CPUs and AI accelerators. It also entered into a multi-year partnership with Nvidia to access millions of Blackwell and Rubin GPUs and to integrate Nvidia Spectrum-X Ethernet switches into its platform, and was also one of Arm’s first major CPU customers.

In the wake of all this, Nabeel Sherif, a principal advisory director at Info-Tech Research Group, posed the burning question: “What are they going to do with all this capacity?”

Primarily it will support Meta’s internal experimentation and innovation, he said, but it also lays the groundwork and provides the capacity for Meta to offer its own agentic AI services, for instance, its Llama AI model as an API, to the market.

“What those [services] will look like and what platforms and tools they’ll use, as well as what guardrails they’ll provide to users, is still unclear, but it’s going to be interesting to see it develop,” said Sherif.

The expanded capacity will enable a diversity of use cases and experimentation across various architectures and platforms, he said. Meta will have many options, and access to supply in an environment currently characterized not only by a wide variety of new CPU approaches, but by significant supply chain constraints. The AWS deal should be viewed as a complement to its partnerships and investments in other platforms like ARM, Nvidia, and AMD.

Kimball agreed that the move is “most definitely additive,” not a replacement or substitution. Meta isn’t moving off GPUs or accelerators, it’s building around them. “This is about assembling a heterogeneous system, not picking a single winner,” he said. “In fact, I think for most, heterogeneity is critical to long term success.”

Nvidia still dominates training and a lot of inference, while AMD is becoming “more and more relevant at scale,” Kimball noted. Arm, meanwhile, whether through CPU, custom silicon or other efforts, gives Meta architectural control, and Graviton5 fits into that mix as a “cost- and efficiency-optimized general-purpose compute layer.”

A question of strategy

The more interesting question is around strategy: Does this signal Meta is becoming a compute provider? Kimball doesn’t think so, noting that it’s likely the company isn’t looking to directly compete with hyperscalers as a general-purpose cloud. “This is more about vertical integration of their own AI stack,” he said.

The move gives them the ability to support internal workloads more efficiently, as well as providing the infrastructure foundation to expose more of that capability externally, whether through APIs, partnerships, or other means, he said.

And there’s a cost dynamic here, too, Kimball noted. As inference becomes persistent, especially with agentic systems, economics shift away from peak floating-point operations per second (FLOPS) (a measure of compute performance) and toward sustained efficiency and total cost of ownership (TCO).

CPUs like Graviton5 are well positioned for the parts of that workload that don’t require accelerators, but still need to run continuously. “At Meta’s scale, even small efficiency gains per workload compound quickly,” Kimball pointed out.

For developers and enterprise IT, the signal is pretty clear, he noted: The AI stack is getting more heterogeneous, not less so. Enterprises are going to see tighter coupling between CPUs, GPUs, and specialized accelerators, with workloads increasingly split across them based on behavior (prefill versus decode, stateless versus stateful, burst versus persistent).

“The implication is that infrastructure decisions have to become more workload-aware,” said Kimball. “It’s less about ‘which cloud?’ and more about ‘where does this specific part of the application run most efficiently?’”

This article originally appeared on NetworkWorld.

Source link

Meta’s compute grab continues with agreement to deploy tens of millions of AWS Graviton cores

Controlling the system, not just scale

Reflecting Meta’s diversified approach to hardware

A question of strategy

10 Ways To Save Serious Money On Your Next PC

3 Disadvantages Of Wirelessly Charging Your Phone

The Easiest Way To Test Your Windows Laptop’s Battery Health

Can You Mix And Match Different Mesh Wi-Fi Brands?

Leave a reply Cancel reply

Meta’s compute grab continues with agreement to deploy tens of millions of AWS Graviton cores

Controlling the system, not just scale

Reflecting Meta’s diversified approach to hardware

A question of strategy

10 Ways To Save Serious Money On Your Next PC

3 Disadvantages Of Wirelessly Charging Your Phone

The Easiest Way To Test Your Windows Laptop’s Battery Health

Can You Mix And Match Different Mesh Wi-Fi Brands?

Google patched 1,072 vulnerabilities in Chrome with AI; one flaw was 13 years old – Computerworld

12 ways to boost your browser – Computerworld

Everything you need to know – Computerworld

Leave a reply Cancel reply