During an event in San Francisco on Tuesday, Arm introduced its first chip, a departure from its traditional licensing-only model. The 136-core CPU, called the AGI CPU, is set to be deployed at scale later this year by its anchor customer, Meta.
The British chip designer's latest offering marks a pivotal shift in its business strategy. Historically, Arm has licensed core IP and instruction set architectures for datacenter products; with the AGI CPU, it is producing its own Arm-branded processors, aimed squarely at powering agentic AI workloads.
While the AGI CPU features Arm Neoverse V3 cores, it will not execute AI models directly – that function typically falls to GPUs or advanced AI ASICs. Instead, Arm envisions this CPU as the backbone powering AI agents, positioning it in direct competition with Nvidia’s standalone Vera CPUs showcased at GTC last week.
According to Mohamed Awad, Arm's Executive Vice President of cloud AI, the CPU is pivotal for achieving artificial general intelligence (AGI).
Despite the spotlight on GPUs in recent years, the emergence of agentic systems, such as OpenClaw, has rekindled the need for general-purpose compute. These systems lean on CPU cores and memory for coding, task automation, and the reinforcement learning needed to train future models. Arm predicts a dramatic rise in CPU demand from this trend, hence its development of the AGI CPU.
The AGI CPU operates at 300 watts and packs 136 Neoverse V3 cores, with clock speeds reaching 3.7 GHz (3.2 GHz base). The processor is manufactured on TSMC's 3 nm process and pairs 2 MB of private L2 cache per core with 128 MB of shared system-level cache.
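A quick back-of-the-envelope sketch of the cache figures quoted above (treating the per-core L2 and the shared system-level cache as separate pools, as described):

```python
# Cache totals for the AGI CPU, from the figures quoted in the article.
CORES = 136
L2_PER_CORE_MB = 2     # private L2 per core
SLC_MB = 128           # shared system-level cache

total_l2_mb = CORES * L2_PER_CORE_MB
total_cache_mb = total_l2_mb + SLC_MB

print(f"Total private L2:  {total_l2_mb} MB")     # 272 MB
print(f"Total on-die cache: {total_cache_mb} MB")  # 400 MB
```

So the private L2 alone (272 MB) is more than twice the size of the shared cache.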
Awad emphasized the focus on creating an efficient design, intentionally omitting non-essential accelerators to maximize die space for targeted performance. "Legacy CPUs often included unnecessary support for outdated applications. We wanted our design to be solely focused on what this device needs,” he remarked.
In contrast to Nvidia's Vera architecture, which employs simultaneous multithreading, Arm's design runs one thread per core, which makes performance scaling more predictable. The CPU includes 12 channels of DDR5 memory, likely split six channels per die, supporting speeds up to 8800 MT/s. With an aggregate bandwidth of 825 GB/s, that works out to roughly 6 GB/s per core.
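The memory figures are easy to sanity-check. The 8-byte (64-bit) width per DDR5 channel is a standard assumption on our part, not something stated in the article:

```python
# Sanity check on the quoted memory figures.
CHANNELS = 12
SPEED_MT_S = 8800
BYTES_PER_TRANSFER = 8        # 64-bit channel width (assumption)
CORES = 136
QUOTED_AGGREGATE_GB_S = 825   # figure quoted by Arm

# Theoretical peak across all channels, in GB/s.
theoretical_peak = CHANNELS * SPEED_MT_S * BYTES_PER_TRANSFER / 1000

# Per-core share of the quoted aggregate bandwidth.
per_core = QUOTED_AGGREGATE_GB_S / CORES

print(f"Theoretical peak: {theoretical_peak:.1f} GB/s")  # 844.8 GB/s
print(f"Per core:         {per_core:.2f} GB/s")          # ~6.07 GB/s
```

The quoted 825 GB/s sits a little under the theoretical 844.8 GB/s peak, and 825 divided across 136 cores gives the roughly 6 GB/s per core the article cites.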
In a bid to reduce latency, Arm integrates both memory and I/O operations within the same die as the compute cores. Each socket will be recognized by the operating system as two separate NUMA domains. For connectivity, the processor is outfitted with 96 lanes of PCIe 6.0, along with support for CXL 3.0.
Meta, which already uses Nvidia's Arm-based Grace CPUs and plans to adopt Vera chips, is among Arm's first significant CPU customers. Arm has also validated two Open Compute Project (OCP) rack designs. The first is a 36 kW air-cooled rack holding 30 compute blades, for a total of 8,160 cores. The second is a denser 200 kW liquid-cooled rack with 42 eight-node servers, yielding 45,696 cores, more than double the 22,528 cores of Nvidia's Vera ETL256 rack.
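The rack core counts above check out, with one inference: dividing 8,160 cores across 30 blades implies two sockets per blade, which the article does not state explicitly.

```python
# Core-count math for the two OCP racks, from the figures above.
CORES_PER_CPU = 136

# Air-cooled rack: 36 kW, 30 compute blades, 8,160 cores total.
air_cores = 8160
cpus_per_blade = air_cores / (30 * CORES_PER_CPU)   # inferred, not stated

# Liquid-cooled rack: 200 kW, 42 servers of 8 nodes each.
liquid_cores = 42 * 8 * CORES_PER_CPU

print(f"CPUs per air-cooled blade: {cpus_per_blade:.0f}")   # 2
print(f"Liquid-cooled rack cores:  {liquid_cores}")         # 45696
print(f"vs Nvidia Vera ETL256:     {liquid_cores / 22528:.2f}x")  # ~2.03x
```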
Arm’s customer list extends beyond Meta, with notable early adopters including OpenAI, SAP, Cerebras, Cloudflare, F5, SK Telecom, and Rebellions. The chip is envisioned not just for AI agents but also as a head node for tailored accelerators, or as a conventional CPU for networking and storage solutions. OEM partners, like Lenovo, are actively developing 19-inch systems that incorporate this new chip.
Historically, enterprise buyers have had limited options for Arm datacenter silicon, with Ampere Computing the only significant non-cloud player. Set to launch later this year, Arm's AGI CPU could reshape that landscape, though whether it ushers in the era of The Singularity remains to be seen.