Most recently, for instance, Walmart announced that it is rolling out a gen AI app to 50,000 non-store employees. As reported by Axios, the app combines data from Walmart with third-party large language models (LLMs) and can help employees with a range of tasks, from speeding up the drafting process, to serving as a creative partner, to summarizing large documents and more.
Deployments such as this are helping to drive demand for the graphics processing units (GPUs) needed to train powerful deep learning models. GPUs are specialized processors that execute instructions in parallel, rather than sequentially as traditional central processing units (CPUs) do.
According to the Wall Street Journal, training these models “can cost companies billions of dollars, thanks to the large volumes of data they need to ingest and analyze.” This includes all deep learning and foundational LLMs from GPT-4 to LaMDA — which power the ChatGPT and Bard chatbot applications, respectively.
Riding the generative AI wave
The gen AI trend is providing powerful momentum for Nvidia, the dominant supplier of these GPUs: The company announced eye-popping earnings for its most recent quarter. At least for Nvidia, it is a time of exuberance, as it seems nearly everyone is trying to get ahold of its GPUs.
Erin Griffith wrote in the New York Times that start-ups and investors are taking extraordinary measures to obtain these chips: “More than money, engineering talent, hype or even profits, tech companies this year are desperate for GPUs.”
In his Stratechery newsletter this week, Ben Thompson refers to this as “Nvidia on the Mountaintop.” Adding to the momentum, Google and Nvidia announced a partnership whereby Google’s cloud customers will have greater access to technology powered by Nvidia’s GPUs. All of this points to the current scarcity of these chips in the face of surging demand.
Does this current demand mark the peak moment for gen AI, or might it instead point to the beginning of the next wave of its development?
How generative tech is shaping the future of computing
Nvidia CEO Jensen Huang said on the company’s most recent earnings call that this demand marks the dawn of “accelerated computing.” He added that it would be wise for companies to “divert the capital investment from general purpose computing and focus it on generative AI and accelerated computing.”
General purpose computing is a reference to CPUs, which are designed for a broad range of tasks, from spreadsheets to relational databases to ERP. Nvidia is arguing that CPUs are now legacy infrastructure, and that developers should instead optimize their code for GPUs so that tasks run more efficiently than they do on traditional CPUs.
GPUs can execute many calculations simultaneously, making them perfectly suited for tasks like machine learning (ML), where millions of calculations are performed in parallel. GPUs are also particularly adept at certain types of mathematical calculations — such as linear algebra and matrix manipulation tasks — that are fundamental to deep learning and gen AI.
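The property described above can be illustrated with a small sketch. It is plain Python running on a CPU, not GPU code, and the function names (`cell`, `matmul_sequential`, `matmul_independent`) are hypothetical names chosen for this example. It only shows why matrix multiplication parallelizes well: every output cell can be computed without waiting on any other cell.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def cell(a, b, i, j):
    """Dot product of row i of a and column j of b: one independent unit of work."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul_sequential(a, b):
    """Compute every output cell one after another."""
    rows, cols = len(a), len(b[0])
    return [[cell(a, b, i, j) for j in range(cols)] for i in range(rows)]


def matmul_independent(a, b, workers=4):
    """Hand each output cell to a worker pool; no cell needs another cell's result."""
    rows, cols = len(a), len(b[0])
    coords = list(product(range(rows), range(cols)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as coords.
        values = list(pool.map(lambda ij: cell(a, b, *ij), coords))
    # Reassemble the flat list of results into rows.
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_sequential(a, b))   # [[19, 22], [43, 50]]
print(matmul_independent(a, b))  # same result, cells computed independently
```

A thread pool stands in here for the thousands of hardware threads a GPU provides. The point is the structure of the work, not the speed of this particular script.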
GPUs offer little benefit for some types of software
However, other classes of software, including most existing business applications, are optimized to run on CPUs and would see little benefit from the parallel instruction execution of GPUs.
Thompson appears to hold a similar view: “My interpretation of Huang’s outlook is that all of these GPUs will be used for a lot of the same activities that are currently run on CPUs; that is certainly a bullish view for Nvidia, because it means the capacity overhang that may come from pursuing generative AI will be backfilled by current cloud computing workloads.”
He continued: “That noted, I’m skeptical: Humans — and companies — are lazy, and not only are CPU-based applications easier to develop, they are also mostly already built. I have a hard time seeing what companies are going to go through the time and effort to port things that already run on CPUs to GPUs.”
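As a rough illustration of the kind of business logic that gains little from parallel execution, consider the sketch below. The scenario and the function name `apply_transactions` are assumptions made up for this example and are not drawn from any of the sources quoted here. Each step depends on the result of the step before it, so the work cannot simply be spread across thousands of GPU threads.

```python
def apply_transactions(opening_balance, amounts):
    """Apply each amount in order; reject any withdrawal that would overdraw."""
    balance = opening_balance
    accepted = []
    for amount in amounts:
        # Whether this step is accepted depends on the balance left by all
        # earlier steps, so the loop cannot be split into independent pieces.
        if balance + amount >= 0:
            balance += amount
            accepted.append(amount)
    return balance, accepted


print(apply_transactions(100, [-80, -50, 30, -40]))  # (10, [-80, 30, -40])
```

Code like this is branch-heavy and order-dependent, which is the pattern general-purpose CPUs handle well.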
We’ve been through this before
Matt Asay of InfoWorld reminds us that we have seen this before. “When machine learning first arrived, data scientists applied it to everything, even when there were far simpler tools. As data scientist Noah Lorang once argued, ‘There is a very small subset of business problems that are best solved by machine learning; most of them just need good data and an understanding of what it means.’”
The point is, accelerated computing and GPUs are not the answer for every software need.
Nvidia had a great quarter, boosted by the current gold rush to develop gen AI applications. The company is naturally ebullient as a result. However, as we have seen from the recent Gartner emerging technology hype cycle, gen AI is having a moment and is at the peak of inflated expectations.
According to Singularity University and XPRIZE founder Peter Diamandis, these expectations are about seeing future potential with few of the downsides. “At that moment, hype starts to build an unfounded excitement and inflated expectations.”
To this very point, we could soon reach the limits of the current gen AI boom. As venture capitalists Paul Kedrosky and Eric Norlin of SK Ventures wrote on their firm’s Substack: “Our view is that we are at the tail end of the first wave of large language model-based AI. That wave started in 2017, with the release of the [Google] transformers paper (‘Attention is All You Need’), and ends somewhere in the next year or two with the kinds of limits people are running up against.”
Those limitations include the “tendency to hallucinations, inadequate training data in narrow fields, sunsetted training corpora from years ago, or myriad other reasons.” They add: “Contrary to hyperbole, we are already at the tail end of the current wave of AI.”
To be clear, Kedrosky and Norlin are not arguing that gen AI is at a dead end. Instead, they believe there need to be substantial technological improvements to achieve anything better than “so-so automation” and limited productivity growth. The next wave, they argue, will include new models, more open source, and notably “ubiquitous/cheap GPUs.” If that prediction is correct, it may not bode well for Nvidia, but it would benefit those needing the technology.
As Fortune noted, Amazon has made clear its intentions to directly challenge Nvidia’s dominant position in chip manufacturing. It is not alone: numerous startups are also vying for market share, as are chip stalwarts including AMD. Challenging a dominant incumbent is exceedingly difficult. In this case, at least, broadening sources for these chips and reducing prices of a scarce technology will be key to developing and disseminating the next wave of gen AI innovation.
The future for gen AI appears bright, despite hitting a peak of expectations and the existing limitations of the current generation of models and applications. The reasons behind this promise are likely several, but perhaps foremost is a generational shortage of workers across the economy that will continue to drive the need for greater automation.
Although AI and automation have historically been viewed as separate, this point of view is changing with the advent of gen AI. The technology is increasingly becoming a driver for automation and the resulting productivity. Mike Knoop, co-founder of workflow company Zapier, referred to this phenomenon on a recent Eye on AI podcast when he said: “AI and automation are mode collapsing into the same thing.”
Certainly, McKinsey believes this. In a recent report, the firm stated: “generative AI is poised to unleash the next wave of productivity.” It is hardly alone. For example, Goldman Sachs stated that gen AI could raise global GDP by 7%.
Whether or not we are at the zenith of the current gen AI wave, it is clearly an area that will continue to evolve and catalyze debates across business. While the challenges are significant, so are the opportunities, especially in a world hungry for innovation and efficiency. The race for GPU domination is but a snapshot in this unfolding narrative, a prologue to the future chapters of AI and computing.