David Luan: Why Nvidia Will Enter the Model Space & Models Will Enter the Chip Space | E1169
25 Jun 2024
- OpenAI realized that the next phase of AI after Transformer models would involve solving major unsolved scientific problems.
- There are still ways to improve model performance, which will require a significant amount of compute.
- David Luan is excited to be on the podcast and has watched previous episodes.
- David Luan has worked at several incredible companies, which have served as training grounds for him.
Lessons from Google Brain & Their Influence on Building Adept (1m3s)
- Google Brain was dominant in AI research during the 2012-2018 period.
- Google Brain's success was due to its ability to attract top talent and provide an environment that encouraged pure bottom-up basic research.
- Pure bottom-up basic research involves hiring brilliant scientists, giving them the freedom to work on open technical problems in AI, and allowing them to collaborate and share ideas.
- This approach led to significant breakthroughs, such as the invention of the Transformer model.
- The Transformer model, invented at Google Brain, was a general-purpose model that could be applied to a wide range of machine learning tasks.
- Prior to the Transformer, different models were required for different tasks, such as image understanding, text generation, and playing games.
- The Transformer's versatility made it the foundation of modern AI and reduced the need for low-level breakthroughs in modeling.
- Researchers could now focus on using the Transformer to solve complex real-world problems.
- The success of the show is attributed to the willingness to ask simple questions.
- ChatGPT is seen as a consumer breakthrough after the introduction of Transformers in 2017.
- The gap between the Transformer breakthrough and consumer adoption of ChatGPT is questioned.
- Language models have been gradually improving since the introduction of Transformers in 2017.
- GPT-2, released in 2019, demonstrated impressive generalist capabilities.
- Two key factors contributed to the viral success of ChatGPT:
  - The intelligence of language models reached a genuinely compelling level.
  - User-friendly packaging finally let consumers interact with the models directly.
- The GPT-3 API was available over a year before ChatGPT, but it lacked the consumer-friendly interface, hindering its widespread adoption.
Takeaways from OpenAI (6m49s)
- OpenAI realized that the next phase of AI after Transformer was not about research paper writing but about solving major unsolved scientific problems.
- Instead of loose collections of researchers, OpenAI focused on building a culture around large groups of scientists working on specific real-world problems.
- This approach is different from the traditional academic curiosity-driven research and is similar to the Apollo project, where the goal is to solve a specific problem rather than just conducting research.
- Historically, scaling up language models has shown diminishing returns for every incremental GPU added.
- However, every doubling of GPUs has resulted in very predictable, consistent returns (see the sketch after this list).
- To make a base language model predictably and consistently smarter, the amount of compute needs to be doubled.
- Model scaling means making the model larger and training it on more data, which reliably makes it smarter.
- The focus is shifting from base model scaling to more efficient methods like simulation, synthetic data, and reinforcement learning.
- Models trained on data have limitations in discovering new knowledge or solving problems not in the training set.
- Chatbots and agents are diverging into distinct technologies with varying use cases and requirements.
- Hallucinations in chatbots and image generators can be beneficial for problem-solving, novelty, and creativity.
- Agents performing tasks like handling taxes or shipping containers should not hallucinate to avoid errors or problems.
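The per-doubling point above is the usual power-law framing of compute scaling. As a rough illustration only (the constants below are invented, not taken from the episode), a toy curve of the form loss(C) = a·C^(-b) shows why one extra GPU buys less than the previous one, while each doubling of compute improves the loss by the same constant factor:

```python
# Toy scaling-law sketch (illustrative constants, not real measurements):
# loss(C) = a * C**(-b). Marginal gains per added GPU shrink, but every
# doubling of compute multiplies the loss by the same factor 2**(-b).

def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Toy power law: loss falls slowly as total training compute grows."""
    return a * compute ** (-b)

if __name__ == "__main__":
    for gpus in (1_000, 2_000, 4_000, 8_000):
        print(f"{gpus:>5} GPUs -> loss {loss(gpus):.3f}")
    # Constant per-doubling improvement factor:
    print(f"per-doubling factor: {2 ** -0.05:.4f}")
```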
Understanding Minimum Viable Capability Levels & Model Scale (16m6s)
- AI systems have become less predictable as they grow larger and more complex.
- The capabilities of a model cannot be fully predicted in advance, even with estimates.
- Minimum viable capabilities are a function of model scale, meaning certain capabilities emerge at specific model sizes.
- It is difficult to predict the exact resource investment needed to achieve specific capabilities.
- Reasoning is a challenging problem in AI that requires new research.
- Pure model scaling alone does not provide solutions to reasoning.
- Reasoning involves composing existing thoughts to discover new ones, which cannot be achieved solely through internet data regurgitation.
- Solving reasoning requires giving models access to theorem-proving environments, similar to how human mathematicians approach problem-solving (a sketch of such a loop follows this list).
- The general capability of reasoning needs to be solved at the model provider level to improve the model's ability to reason.
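A minimal sketch of what "giving a model access to a theorem-proving environment" could look like, assuming a hypothetical prover interface (this is not Adept's or OpenAI's actual system): the model proposes steps, the environment checks them, and the accept/reject signal grounds the reasoning rather than relying on imitation of internet text.

```python
# Hypothetical model-in-the-loop proving sketch; the interfaces below are
# assumptions for illustration, not a real prover or model API.

from typing import Protocol

class ProofEnv(Protocol):
    def apply(self, step: str) -> tuple[bool, str]:
        """Attempt a proof step; return (accepted, remaining_goal_or_error)."""
        ...

class Model(Protocol):
    def propose(self, goal: str, history: list[str]) -> str:
        """Suggest the next proof step given the current goal and history."""
        ...

def prove(model: Model, env: ProofEnv, goal: str, max_steps: int = 50) -> bool:
    """Let the model interact with the environment until the goal is proved."""
    history: list[str] = []
    for _ in range(max_steps):
        step = model.propose(goal, history)
        accepted, goal = env.apply(step)
        history.append(step if accepted else f"rejected: {step}")
        if accepted and goal == "QED":
            return True   # the environment, not the model, certifies success
    return False          # failure is itself a usable training signal
```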
The Future of the Foundational Model Layer (20m17s)
- The high cost of solving reasoning limits the number of long-term, steady-state LLM providers to a maximum of 5-7.
- Solving reasoning involves training a base model, granting it access to various environments to solve complex problems, and incorporating human input.
- Memory in AI presents challenges in both short-term working memory and long-term memory, with progress made in the former but not the latter.
- Application developers should integrate long-term memory about user preferences as part of a larger system (see the sketch after this list).
- LLMs are components of a larger software system, and the winning LLM providers will be those that existentially need to win, such as tier-one cloud providers.
- Nvidia should expand its chip business by offering LLM services, while major LLM providers are developing in-house chips for better margins.
- The interface of the LLM provides significant leverage downstream, leading to expected vertical integration pressure between model builders and chip makers.
- Chip makers need to own something at the model layer to avoid commoditization.
- Apple has an advantage in running capable models for free at the edge, but that alone may not be enough to compete with the smartest models.
- Apple will excel in private, fine-tuned models that don't require extensive reasoning and will run on the edge.
- OpenAI's GPT-4o represents a scientific advancement towards universal models that can process various inputs and generate any output.
- Apple's partnership with OpenAI suggests recognition of OpenAI's progress and hints at a future where foundational models become commoditized.
- Tier-one cloud providers will invest heavily in foundational models, while independent companies selling models to developers must either become first-party efforts of big clouds or build a substantial economic flywheel before commoditization.
- Adept's focus on selling end-user-facing agents to enterprises differentiates its business model from companies selling models to developers.
- ChatGPT gives OpenAI a significant head start in building the economic flywheel an independent foundational model company needs.
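On the earlier point about application developers owning long-term memory: a minimal sketch, assuming a hypothetical JSON preference store and prompt-assembly helper (names and storage format are illustrative, not from the episode), of how user preferences can live outside the model and be injected into each call the application makes.

```python
# Hypothetical long-term memory layer owned by the application, not the LLM.
import json
from pathlib import Path

PREFS_PATH = Path("user_prefs.json")  # assumed storage location

def load_prefs(user_id: str) -> dict:
    """Return this user's stored preferences, or an empty dict."""
    if PREFS_PATH.exists():
        return json.loads(PREFS_PATH.read_text()).get(user_id, {})
    return {}

def save_pref(user_id: str, key: str, value: str) -> None:
    """Persist one preference learned during a conversation."""
    data = json.loads(PREFS_PATH.read_text()) if PREFS_PATH.exists() else {}
    data.setdefault(user_id, {})[key] = value
    PREFS_PATH.write_text(json.dumps(data, indent=2))

def build_prompt(user_id: str, user_message: str) -> str:
    """Prepend long-term memory to whatever model call the application makes."""
    prefs = load_prefs(user_id)
    memory = "\n".join(f"- {k}: {v}" for k, v in prefs.items()) or "- none"
    return f"Known user preferences:\n{memory}\n\nUser: {user_message}"
```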
Adept’s Focus on Vertical Integration for AI Agents (33m26s)
- Adept is focused on building an AI agent that can handle arbitrary work tasks.
- Adept is not trying to train foundation models to sell to others, but rather building a vertically integrated stack from the user interface to the foundation modeling layer.
- Adept believes that vertical integration of model with use case is the only way to solve the generalization problem and handle the variability of agent requirements across industries.
- Adept's advantage is that it is building a system that can handle any workflow or task in any enterprise, rather than focusing on a particular narrow problem.
- Every enterprise workflow is an edge case, and controlling the entire stack is necessary to handle the variability and complexity of these workflows.
The Distinction Between RPA & Agents (35m53s)
- RPA is suitable for repetitive tasks, while agents require constant thinking and planning to solve goals.
- Agents disrupt the business model of RPA players by demanding self-serve capabilities and addressing challenging use cases.
- Pricing may shift from per-seat to consumption-based in some areas, but the most valuable knowledge work will not be priced that way, because AI agents empower people to explore new possibilities and raise their productivity.
- The objective is to develop AI systems that act as co-pilots or teammates.
- Unlike traditional workers, these AI systems should not charge based on their workload but rather on their ability to augment human capabilities and create new opportunities.
The Co-pilot Approach: Incumbent Strategy or Innovation Catalyst (40m24s)
- Co-pilots can be a great incumbent strategy for companies to morph their existing software business model into AI.
- AI is not going to take all jobs, but it will augment human capabilities and make them more efficient.
- Co-pilot style approach is necessary for humans to effectively use AI systems.
- Collapsing the talent stack, where individuals have multiple skill sets and can simultaneously fulfill different roles, will make projects and teams more effective.
- Humans at work will become more like generalists, with larger scope over various functions, while supervising specialized AI co-pilots.
Enterprise AI Adoption Budgets: Experimental vs. Core (42m46s)
- Most Enterprise AI adoption is still in the experimental budget phase.
- Many enterprises still have a lot of on-prem infrastructure and workflows running on mainframes.
- Cloud technology, which is considered mature from a startup perspective, still doesn't have full adoption in enterprises.
- There is a concern that AI adoption may follow a similar pattern to autonomous driving, with a period of plateauing progress.
- Unlike self-driving cars, where there was an initial breakthrough followed by incremental improvements, AI models are constantly improving with new scientific advancements.
- Breakthroughs like the universal multimodal GPT-4o are visible and continue to enhance the capabilities of AI models.
- AI models are already being deployed; they don't need to reach a specific reliability threshold before they can provide value.
- AI services companies, which help large enterprises implement AI, are projected to be bigger than the model providers themselves in terms of revenue.
- This projection has been proven true, with some people who criticized the prediction later acknowledging their mistake when the revenue numbers were revealed.
AI Services Providers vs. Actual Providers (46m53s)
- AI Services providers may not be the biggest players in the long term.
- Companies that turn use cases with product-market fit into repeatable products will likely be the real economic winners.
- Many AI services will eventually be turned into generalizable products.
- There is a concern that regulations on data usage and collection could hinder the progression of AI models.
- Regulatory capture is a concern, with lawmakers potentially favoring established companies and making it harder for new entrants to compete.
- This could lead to a concentration of power in the AI industry and stifle innovation.
Open vs. Closed AI Systems for Crucial Decision Making (49m32s)
- Open AI systems (open models) often lag behind closed systems due to fewer resources and weaker incentives, but they are crucial for keeping the rest of the field competitive with the major players.
- The final step to achieving Artificial General Intelligence (AGI) involves finding the optimal interface between AI systems and humans.
- The current sequential approach to AI technology development is not ideal. Instead, we should consider how humans will use AI systems and design the entire solution end-to-end.
- More attention should be given to developing effective ways for humans to interact with, supervise, and correct AI systems as they become more intelligent.
- Human interaction with AI systems is richer and more effective than writing instructions, especially as systems become more intelligent.
Quick-Fire Round (54m18s)
- David Luan thinks agents and chatbots will diverge into two distinct products: one for practical tasks and the other for therapeutic or entertainment purposes.
- The biggest misconception about AI in the next decade is that it will fully automate every human capability. Instead, AI will likely serve as a tool to enhance human intelligence.
- In five years, agents could become like a non-invasive brain-computer interface, allowing humans to think and reason at a higher level.
- One reason why this vision of agents might not happen is the resistance from incumbent software companies that bundle software into specific functional areas, preventing agents from bridging different domains.
- Luan believes that the potential market for agents is much larger than that of robotic process automation (RPA) because agents can address a wider range of tasks.