By Katie Paul, Stephen Nellis and Max A. Cherney
(Reuters) – Facebook owner Meta Platforms plans to deploy into its data centers this year a new version of a custom chip aimed at supporting its artificial intelligence (AI) push, according to an internal company document seen by Reuters on Thursday.
The chip, a second generation of an in-house silicon line Meta announced last year, could help to reduce Meta’s dependence on the Nvidia chips that dominate the market and control the spiraling costs associated with running AI workloads as it races to launch AI products.
The world’s biggest social media company has been scrambling to boost its computing capacity for the power-hungry generative AI products it is pushing into apps Facebook, Instagram and WhatsApp and hardware devices like its Ray-Ban smartglasses, spending billions of dollars to amass arsenals of specialized chips and reconfigure data centers to accommodate them.
At the scale at which Meta operates, a successful deployment of its own chip could potentially shave off hundreds of millions of dollars in annual energy costs and billions in chip purchasing costs, according to Dylan Patel, founder of the silicon research group SemiAnalysis.
The chips, infrastructure and energy required to run AI applications has become a giant sinkhole of investment for tech companies, to some degree offsetting gains made in the rush of excitement around the technology.
A Meta spokesperson confirmed the plan to put the updated chip into production in 2024, saying it would work in coordination with the hundreds of thousands of off-the-shelf graphics processing units (GPUs) – the go-to chips for AI – the company was buying.
“We see our internally developed accelerators to be highly complementary to commercially available GPUs in delivering the optimal mix of performance and efficiency on Meta-specific workloads,” the spokesperson said in a statement.
Meta CEO Mark Zuckerberg last month said the company planned to have by the end of the year roughly 350,000 flagship “H100” processors from Nvidia, which produces the most sought-after GPUs used for AI. Combined with other suppliers, Meta would accumulate the equivalent compute capacity of 600,000 H100s in total, he said.
The deployment of its own chip as part of that plan is a positive turn for Meta’s in-house AI silicon project, after a decision by executives in 2022 to pull the plug on the chip’s first iteration.
The company instead opted to buy billions of dollars worth of Nvidia’s GPUs, which have a near monopoly on an AI process called training that involves feeding enormous data sets into models to teach them how to perform tasks.
The new chip, referred to internally as “Artemis,” like its predecessor can perform only a process known as inference in which the models are called on to use their algorithms to make ranking judgments and generate responses to user prompts.
Reuters last year reported that Meta is also working on a more ambitious chip that, like GPUs, would be capable of performing both training and inference.
The Menlo Park, California-based company shared details about the first generation of its Meta Training and Inference Accelerator (MTIA) program last year. The announcement portrayed that version of the chip as a learning opportunity.
Despite those early stumbles, an inference chip could be considerably more efficient at crunching Meta’s recommendation models than the energy thirsty Nvidia processors, according to Patel.
“There is a lot of money and power being spent that could be saved,” he said.
(Reporting by Katie Paul in New York and Stephen Nellis and Max A. Cherney in San Francisco; Editing by Kenneth Li and Mark Porter)