-6.3 C
New York
Wednesday, January 22, 2025

The bitter lesson for generative AI adoption



The rapid pace of innovation and the proliferation of new models have raised concerns about technology lock-in. Lock-in occurs when businesses become overly reliant on a specific model with bespoke scaffolding that limits their ability to adapt to innovations. Upon its release, GPT-4 was the same cost as GPT-3 despite being a superior model with much higher performance. Since the GPT-4 release in March 2023, OpenAI prices have fallen another six times for input data and four times for output data with GPT-4o, released May 13, 2024. Of course, an analysis of this sort assumes that generation is sold at cost or a fixed profit, which is probably not true, and significant capital injections and negative margins for capturing market share have likely subsidized some of this. However, we doubt these levers explain all the improvement gains and price reductions. Even Gemini 1.5 Flash, released May 24, 2024, offers performance near GPT-4, costing about 85 times less for input data and 57 times less for output data than the original GPT-4. Although eliminating technology lock-in may not be possible, businesses can reduce their grip on technology adoption by using commercial models in the short run.

Avoiding lock-in risks

In some respects, the bitter lesson is part of this more considerable discussion about lock-in risks. We expect scaling to continue, at least for another couple of interactions. Unless you have a particular use case with obvious commercial potential, or operate within a high-risk and highly regulated industry, adopting the technology before the full scaling potential is determined and exhausted may be hasty.

Ultimately, training a language model or adopting an open-source model is like swapping a leash for a ball and chain. Either way, you’re not walking away without leaving some skin in the game. You may need to train or tune a model in a narrow domain with specialized language and tail knowledge. However, training language models involves substantial time, computational resources, and financial investment. This increases the risk for any strategy. Training a language model can cost hundreds of thousands to millions of dollars, depending on the model’s size and the amount of training data. The economic burden is exacerbated by the nonlinear scaling laws of model training, in which gains in performance may require exponentially greater compute resources—highlighting the uncertainty and risk involved in such endeavors. Bloomberg’s strategy of including a margin of error of 30 percent of their computing budget underscores the unpredictable nature of training.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles