The Technology Powering AI: Understanding Intel’s NPUs
In a recent conversation, Jenny Darmody spoke with Michael Langan from Intel to explore the essential work carried out by the neural processing unit (NPU) team and the evolving architecture of these specialized hardware components.
As the excitement surrounding artificial intelligence (AI) continues to grow, it’s difficult to avoid news related to the technology. Whether it's announcements about new large language models (LLMs) or funding updates for AI-focused startups, the buzz is everywhere.
This surge in interest has also led to a phenomenon known as AI washing, where companies exaggerate or misrepresent AI capabilities to attract customers and investors. This has prompted regulatory bodies, including the US Federal Trade Commission, to take action against companies using misleading AI practices.
Amidst the focus on software and applications, it's important to remember that robust hardware is needed to support the computational demands of AI. This is where neural processing units enter the picture. Also referred to as AI accelerators, NPUs are designed to enhance and speed up the computations necessary for AI models to operate effectively.
The NPU Team at Intel
To gain insight into the NPU development process, Jenny Darmody caught up with Michael Langan during the annual Midas conference in November 2024. Langan has been with Intel for 14 years and currently leads the NPU intellectual property (IP) team based in Ireland. He emphasized that this IP is crucial for all client devices, including laptops and desktops, representing a market worth approximately $30 billion in annual revenue, with AI being a pivotal aspect of it.
The global NPU IP team at Intel consists of about 500 individuals, and Langan notes that the roots of this IP can be traced back to Movidius, an Irish startup acquired by Intel in 2016. He explained that the development of NPUs gained momentum around 2012 with the advent of convolutional neural networks, which are widely used for image recognition tasks.
A significant turning point occurred in 2017 when Google published a paper titled 'Attention is All You Need', introducing the transformer architecture. Langan stated that this advancement dramatically transformed the landscape, laying the foundation for generative models like ChatGPT and other LLMs. He added, "The design we do is to accelerate workloads like that," highlighting the team's focus on evolving architectural needs.
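To give a sense of the workload NPUs are built to accelerate, the transformer's core operation, scaled dot-product attention, reduces to a handful of dense matrix multiplications. The snippet below is an illustrative NumPy sketch of that operation, not Intel's implementation; the function name and toy dimensions are chosen for this example:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention as described in 'Attention is All You Need'.

    Q, K, V: (seq_len, d_k) arrays. The matrix multiplies below are exactly
    the kind of dense compute that NPUs are designed to accelerate.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Softmax over the key dimension, shifted for numerical stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # attention-weighted sum of the values

# Toy example: 4 tokens, 8-dimensional queries/keys/values
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

In a production model this runs per attention head, at every layer, for every token, which is why hardware that speeds up matrix multiplication pays off so directly.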
At Intel, Langan mentioned that their work encompasses every aspect of NPU development. He described their hardware team as working on Verilog RTL design alongside a large verification team. Their designs can be utilized across various applications, leveraging both TSMC and Intel process nodes. Furthermore, there is a substantial software and compiler team involved, as optimizing AI compilers is vital to their technology strategy. In Ireland, Langan's team includes around 250-300 members, contributing significantly to the broader Intel effort.
Challenges in the Evolving Landscape
One of the biggest challenges faced by the NPU team is the rapid pace of technological change, particularly in customer demands. Langan shared how the dynamic has shifted from Intel's team promoting new features to customers approaching them with requests for new applications and tailored features.
Another notable challenge is the talent shortage in the market, given the specialized skills required for NPU development. Langan indicated that Intel is continuously on the lookout for individuals skilled in deep learning hardware and software, AI compilers, and related fields.
To address this skills gap, Intel implemented an internship program with universities over a decade ago, successfully building a strong pipeline of talent. Langan praised the high caliber of candidates sourced from these academic institutions, affirming the recognition of Ireland's talent on a global scale.
Looking ahead, while Langan remains focused on optimizing AI models, he acknowledged the growing curiosity around what the next major architecture may be following transformers. He pointed out that numerous papers are being released weekly, with some suggesting new architectures that could replace the transformer model.
Among these emerging models are Mamba and Hymba, both designed to improve training efficiency, reduce power consumption, and enhance performance. Langan confirmed that the NPU team is closely monitoring these developments to incorporate innovations into their hardware designs.