DeepSeek and its implications for the future of AI

The Wing Team

The recent emergence of DeepSeek has sparked countless conversations about the future of AI, and the Wing team is no exception. As those of you who have visited our offices know, we have family-style lunch every day. Yesterday, the conversation focused entirely on this topic dominating the zeitgeist. As our founding partner Peter Wagner started, “Let’s look down the road. What are the implications?” Some of the conversation points were:

The shift in value: Moving up the stack 

The first clear implication is a shift of value up the stack. It’s good to be in applications: while Nvidia fell 17% and the NASDAQ fell 3% on Monday, Salesforce and Workday were each up 3%.

The AI industry is experiencing shifts in where value is being created and captured. Traditionally, focus and investment were directed toward the infrastructure layer, such as the development of specialized hardware (e.g., GPUs) and foundational models. Market dynamics are changing, with applications and software layers gaining prominence. Investors may see the potential for higher margins and more sustainable value in the application layer, where AI can be directly integrated into business processes and user-facing products. That said, infrastructure providers like Nvidia or CoreWeave might be affected, but the reduced costs could lead to more AI development overall, potentially broadening their customer base.

The inevitability of lower-cost architectures

It was inevitable that a lower-cost architecture would emerge. For the past two years we had been asking, “Why does it cost hundreds of millions of dollars to train frontier models, and should it?” Those costs raised questions about the sustainability and accessibility of such models. DeepSeek’s emergence with its R1 model represents a significant step toward addressing these concerns. By potentially reducing the costs associated with model training, DeepSeek is challenging the (albeit nascent) standard and paving the way for a more democratized AI landscape. Perhaps this shift will make AI technology more accessible to smaller companies and researchers, fostering innovation across a broader range of participants.

The innovation behind DeepSeek's R1 model

DeepSeek’s R1 model is a compelling surprise and innovation, and there appear to be significant algorithmic and architectural insights behind it. It has garnered attention for its unique approach to AI development; the model’s performance and efficiency suggest DeepSeek may have made advancements in both algorithmic design and architectural choices. Will these innovations set a new benchmark for the industry? The surprise element of R1’s capabilities highlights the rapid pace of progress in AI research, and the potential for new players to disrupt the market in short order.

Additionally, the R1 model is likely not an entirely new development, but rather a set of techniques allowing a “fast follower” to rapidly deliver quality comparable to leading models at a much lower cost. One of these techniques is probably “distillation”: training a new model on query/response pairs generated by a prior model. This method is frowned upon by AI researchers and may face attempts to block it in the future. While these techniques won’t help create a “best in the world” model, they enable catching up to leading models quickly and cheaply.
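To make the distillation idea above concrete, here is a deliberately toy sketch: a “student” is trained on query/response pairs sampled from a “teacher.” This is not DeepSeek’s actual pipeline; the teacher here is a stand-in function, and all names (`teacher_model`, `build_distillation_set`, `train_student`) are hypothetical illustrations of the general pattern.

```python
def teacher_model(query: str) -> str:
    # Stand-in for an expensive frontier model's API; a real pipeline
    # would call the teacher model and record its generated responses.
    return query.upper()

def build_distillation_set(queries):
    # Collect (query, teacher_response) pairs to use as supervised
    # training data for the cheaper student model.
    return [(q, teacher_model(q)) for q in queries]

def train_student(dataset):
    # Stand-in for fine-tuning: this toy student simply memorizes the
    # pairs; a real student would be a smaller model fit to this data.
    return dict(dataset)

queries = ["what is ai", "define distillation"]
dataset = build_distillation_set(queries)
student = train_student(dataset)
print(student["what is ai"])  # → WHAT IS AI
```

The point of the pattern is that the student never needs the teacher’s training data or weights, only its outputs, which is why providers of closed models may try to restrict such queries.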

Reverse engineering and the response from closed models

R1 was enabled through extensive reverse engineering of some closed models. Perhaps the closed-model providers will respond by closing the loopholes that permitted those queries and petitioning for IP-infringement protection, to which the media companies may reply with a “cry me a river” perspective. This approach raises important questions about intellectual property, competition, and the future of AI development. It may also signal a shift toward a broader view that the AI landscape should be more open and collaborative.

The impact on capital and investment

Perhaps the capital environment for frontier-model companies softens, as it’s no longer clear they can charge a premium for higher-performing models and generate a return on substantial upfront training costs. If the cost of developing high-performing models decreases, the business models of companies that have relied on premium pricing may be challenged, and it becomes unclear who will fund frontier model development if it can be quickly and cheaply copied. Companies like OpenAI, Anthropic, and Mistral, which have raised substantial amounts of capital to develop their models, may find the market for their products becoming more competitive. This could lead to a more cautious approach from investors, who may be less willing to provide the large upfront investments required for training complex models. Overall, though, it could significantly lower AI costs for the entire ecosystem, benefiting application developers and providers of other AI stack layers by improving the economics of AI-based systems.

DeepSeek's R1 model represents a significant development in the AI landscape, with implications that extend beyond the technical advancements of the model itself. The shift in value creation, the emergence of lower-cost architectures, and the potential impact on capital and investment all point to a future where AI is more accessible, competitive, and potentially more collaborative.

Let us know your thoughts and reactions. Peter is at @peter_wagner.
