How Nvidia created the chip powering the generative AI boom
In 2022, US chipmaker Nvidia released the H100, one of the most powerful processors it had ever built — and one of its most expensive, costing about $40,000 each. The launch seemed badly timed, just as businesses sought to cut spending amid rampant inflation.
Then in November, ChatGPT was launched.
“We went from a pretty tough year last year to an overnight turnround,” said Jensen Huang, Nvidia’s chief executive. OpenAI’s hit chatbot was an “aha moment”, he said. “It created instant demand.”
ChatGPT’s sudden popularity has triggered an arms race among the world’s leading tech companies and start-ups that are rushing to obtain the H100, which Huang describes as “the world’s first computer [chip] designed for generative AI”: artificial intelligence systems that can quickly create humanlike text, images and content.
The value of having the right product at the right time became apparent this week. Nvidia announced on Wednesday that its sales for the three months ending in July would be $11bn, more than 50 per cent ahead of Wall Street’s previous estimates, driven by a revival in data centre spending by Big Tech and demand for its AI chips.
Investors’ response to the forecast added $184bn to Nvidia’s market capitalisation in a single day on Thursday, taking what was already the world’s most valuable chip company close to a $1tn valuation.
Nvidia is an early winner from the astronomical rise of generative AI, a technology that threatens to reshape industries, produce huge productivity gains and displace millions of jobs.
That technological leap is set to be accelerated by the H100, which is based on a new Nvidia chip architecture dubbed “Hopper” (named after the American programming pioneer Grace Hopper) and has suddenly become the hottest commodity in Silicon Valley.
“This whole thing took off just as we’re going into production on Hopper,” said Huang, adding that manufacturing at scale began just a few weeks before ChatGPT debuted.
Huang’s confidence in continued gains stems in part from being able to work with chip manufacturer TSMC to scale up H100 production to satisfy exploding demand from cloud providers such as Microsoft, Amazon and Google, internet groups such as Meta and corporate customers.
“This is among the most scarce engineering resources on the planet,” said Brannin McBee, chief strategy officer and founder of CoreWeave, an AI-focused cloud infrastructure start-up that was one of the first to receive H100 shipments earlier this year.
Some customers have waited up to six months to get hold of the thousands of H100 chips that they want to train their vast data models. AI start-ups had expressed concerns that H100s would be in short supply at just the moment demand was taking off.
Elon Musk, who has bought thousands of Nvidia chips for his new AI start-up X.ai, said at a Wall Street Journal event this week that at present the GPUs (graphics processing units) “are considerably harder to get than drugs”, joking that was “not really a high bar in San Francisco”.
“The cost of compute has gotten astronomical,” added Musk. “The minimum ante has got to be $250mn of server hardware [to build generative AI systems].”
The H100 is proving particularly popular with Big Tech companies such as Microsoft and Amazon, which are building entire data centres dedicated to AI workloads, and with generative-AI start-ups such as OpenAI, Anthropic, Stability AI and Inflection AI, because it promises higher performance that can accelerate product launches or reduce training costs over time.
“In terms of getting access, yes this is what ramping a new architecture GPU feels like,” said Ian Buck, head of Nvidia’s hyperscale and high-performance computing business, who has the daunting task of increasing the supply of H100s to meet demand. “It’s happening at hyper scale,” he added, with some big customers looking for tens of thousands of GPUs.
The unusually large chip, an “accelerator” designed to work in data centres, has 80bn transistors, five times as many as the processors that power the latest iPhones. While it is twice as expensive as its predecessor, the A100 released in 2020, early adopters say the H100 boasts at least three times better performance.
“The H100 solves the scalability question that has been plaguing [AI] model creators,” said Emad Mostaque, co-founder and chief executive of Stability AI, one of the companies behind the Stable Diffusion image generation service. “This is important as it lets us all train bigger models faster as this moves from a research to an engineering problem.”
While the timing of the H100’s launch was ideal, Nvidia’s breakthrough in AI can be traced back almost two decades to an innovation in software rather than silicon.
Its Cuda software, created in 2006, allows GPUs to be repurposed as accelerators for workloads beyond graphics. Then in around 2012, Buck explained, “AI found us.”
Researchers in Canada realised that GPUs were ideally suited to creating neural networks, a form of AI inspired by the way neurons interact in the human brain, which were then becoming a new focus for AI development. “It took almost 20 years to get to where we are today,” said Buck.
Nvidia now has more software engineers than hardware engineers to enable it to support the many different kinds of AI frameworks that have emerged in the subsequent years and make its chips more efficient at the statistical computation needed to train AI models.
Hopper was the first architecture optimised for “transformers”, the approach to AI that underpins OpenAI’s “generative pre-trained transformer” chatbot. Nvidia’s close work with AI researchers allowed it to spot the emergence of the transformer in 2017 and start tuning its software accordingly.
“Nvidia arguably saw the future before everyone else with their pivot into making GPUs programmable,” said Nathan Benaich, general partner at Air Street Capital, an investor in AI start-ups. “It spotted an opportunity and bet big and consistently outpaced its competitors.”
Benaich estimates that Nvidia has a two-year lead over its rivals but adds: “Its position is far from unassailable on both the hardware and software front.”
Stability AI’s Mostaque agrees. “Next-generation chips from Google, Intel and others are catching up [and] even Cuda becomes less of a moat as software is standardised.”
To some in the AI industry, Wall Street’s enthusiasm this week looks overly optimistic. Nevertheless, “for the time being”, said Jay Goldberg, founder of chip consultancy D2D Advisory, “the AI market for semis looks set to remain a winner-takes-all market for Nvidia.”
Additional reporting by Madhumita Murgia