Microsoft announces powerful new chip for AI inference

Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse designed for scaling AI inference. The chip follows the Maia 100, released in 2023, and is built to run powerful AI models faster and more efficiently.

The Maia 200 packs more than 100 billion transistors and delivers over 10 petaflops of 4-bit (FP4) performance and roughly 5 petaflops at 8-bit (FP8) precision, a substantial increase over its predecessor.
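The announced figures follow a familiar pattern in accelerator design: halving numeric precision roughly doubles arithmetic throughput on the same silicon. A back-of-the-envelope sketch (only the petaflop numbers come from the announcement; the rest is illustrative arithmetic, not official Microsoft data):

```python
# Announced Maia 200 throughput figures (from the article)
FP4_PFLOPS = 10.0  # >10 petaflops at 4-bit precision
FP8_PFLOPS = 5.0   # ~5 petaflops at 8-bit precision

# 4-bit operands are half the width of 8-bit ones, so the same
# datapath can process twice as many per cycle -- which is exactly
# the ratio the announced numbers imply:
ratio = FP4_PFLOPS / FP8_PFLOPS
print(f"FP4/FP8 throughput ratio: {ratio:.1f}x")  # prints "FP4/FP8 throughput ratio: 2.0x"
```

The trade-off is accuracy: 4-bit weights carry far less information, so FP4 inference typically relies on careful quantization of a model trained at higher precision.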

Inference is the computing work of running a trained AI model, as opposed to the compute required to train it. As AI companies mature, inference costs have become an increasingly large share of their operating expenses, driving renewed interest in optimization.
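The distinction is easy to see in code. In this minimal sketch (a toy linear model, purely illustrative and unrelated to any Microsoft system), training is the expensive loop that updates the model's parameters, while inference is just the cheap forward pass that runs once per user request:

```python
import numpy as np

# Toy model y = w * x, fit to data generated by y = 3x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# TRAINING: repeatedly update w via gradient descent (done once, costly).
w = 0.0
lr = 0.01
for _ in range(500):
    grad = np.mean(2 * (w * x - y) * x)  # gradient of mean squared error
    w -= lr * grad

# INFERENCE: a single forward pass with the frozen weight
# (done for every request, so its cost dominates at scale).
def infer(x_new, w):
    return w * x_new

print(infer(10.0, w))  # ~30.0
```

Chips like the Maia 200 target the second phase: serving many forward passes quickly and efficiently, rather than running the training loop.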

Microsoft hopes the Maia 200 can be part of that optimization, helping AI services run with less disruption and lower power use. The company stated that a single Maia 200 node can effortlessly run today's largest models, with plenty of headroom for even bigger models in the future.

Microsoft's new chip is part of a growing trend of tech giants designing their own silicon to lessen their dependence on NVIDIA, whose cutting-edge GPUs have become pivotal to AI success. Google, for instance, has its Tensor Processing Units, or TPUs. Amazon has its own AI accelerator, Trainium, whose latest version, the Trainium3, launched in December. In each case, these custom chips can offload some compute from NVIDIA GPUs, reducing overall hardware costs.

With Maia, Microsoft is positioning itself to compete with those alternatives. The company noted that the Maia 200 delivers three times the FP4 performance of Amazon's third-generation Trainium chips, and FP8 performance above that of Google's seventh-generation TPU.

Microsoft says the Maia 200 is already powering AI models from the company's Superintelligence team and supporting the operations of its Copilot chatbot. As of the announcement, Microsoft has invited developers, academics, and frontier AI labs to use the Maia 200 software development kit in their own workloads.