Alex Wang: Why Data Not Compute is the Bottleneck to Foundation Model Performance | E1164

12 Jun 2024

Intro (0s)

  • Alex Wang believes that AI technology has the potential to be a significant military asset, potentially even more impactful than nuclear weapons.
  • China's centralized system allows for aggressive centralized action and industrial policy to drive forward critical industries, giving them an advantage in the race for AI dominance.
  • Alex Wang suggests that the bottleneck to foundation model performance is not compute but data.
  • While compute power has been increasing exponentially, the amount of data available for training these models has not kept pace.
  • The lack of sufficient data leads to models that are less capable and less generalizable.
  • To address this bottleneck, Alex Wang emphasizes the need for more efficient data collection and curation methods, as well as techniques for making better use of existing data.

Diminishing Returns in AI Compute (1m5s)

  • Despite significant advancements in compute power, there hasn't been a substantial breakthrough in foundation model performance since the release of OpenAI's GPT-4 model.
  • Progress in AI comes from advancements in compute, algorithms, and data.
  • The recent plateau in AI performance is due to hitting a "data wall," where easily accessible data from the internet has been exhausted.
  • To overcome this data wall, there needs to be a focus on acquiring more diverse and specialized data to train AI models.
  • The goal of AGI (Artificial General Intelligence) is not just to emulate the internet but to perform tasks, solve problems, and collaborate with humans.
  • Frontier data, which includes complex reasoning chains, discussions, and tool use, is necessary to power the advanced capabilities of AI models (a hypothetical example of such data is sketched after this list).
  • Frontier data can be captured by mining the vast amount of proprietary data locked up in large enterprises.
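
To make "frontier data" concrete, here is a minimal, hypothetical sketch of what a single training example combining a reasoning chain, a tool call, and expert verification might look like. The schema, field names, and the spreadsheet tool are illustrative assumptions, not any lab's actual data format.

```python
# Hypothetical sketch of one "frontier data" training example: a multi-step
# reasoning chain with tool use, verified by a human expert.
# All field names and the tool are illustrative assumptions, not a real schema.
frontier_example = {
    "domain": "corporate finance",
    "prompt": "Estimate the debt service coverage ratio from the attached 10-K excerpt.",
    "reasoning_chain": [
        {"step": 1, "thought": "Locate net operating income and total debt service in the filing."},
        {"step": 2, "tool_call": {"name": "spreadsheet.extract", "args": {"cells": "B12:B14"}}},
        {"step": 3, "thought": "DSCR = net operating income / total debt service."},
    ],
    "final_answer": "DSCR is roughly 1.8, indicating adequate coverage.",
    "expert_review": {"verified": True, "reviewer_expertise": "corporate finance analyst"},
}
```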

Solving Reasoning to Overcome Limits (9m8s)

  • One of the most prominent CTOs in AI believes the real breakthrough in overcoming diminishing returns will come from solving reasoning.
  • Human intelligence is very general and can adapt to new situations, while machine intelligence is very specific and requires data for every scenario it needs to reason in.
  • There are two ways to resolve the reasoning gap in current models: build a general reasoning capability or overwhelm the models with data in every scenario where they need to reason well.
  • The bottleneck to foundation model performance is data, not compute.
  • Models perform well in situations where they have seen a lot of data before.
  • To improve model performance, we need to provide them with more data in the scenarios where we want them to perform well.

From Data Scarcity to Abundance (10m56s)

  • Despite the immense amount of data held by large enterprises like JP Morgan and governments, much of it is proprietary and not accessible for generalized models that could benefit humanity.
  • Synthetic data creation and increasing the means of data production are necessary to move from data scarcity to abundance.
  • Producing data involves a hybrid human-synthetic process, where AI systems generate large amounts of data while human experts guide the process and review outputs to ensure quality (a minimal sketch of such a loop follows this list).
  • New roles such as AI trainers or contributors are needed to facilitate this process of contributing data to AI models.
  • Contributing data to AI models is a high-leverage job that allows human experts to have society-wide impact by improving the models' capabilities.
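
Below is a minimal sketch of the hybrid human-synthetic loop described above: a model drafts candidate examples at scale, human experts approve or reject them, and only verified examples enter the dataset. The `generate_candidates` and `expert_review` functions are hypothetical stand-ins, not any specific model or review API.

```python
# Minimal sketch of a hybrid human-synthetic data pipeline, under the
# assumptions stated above; the two helper functions are placeholders.

def generate_candidates(seed_prompt: str, n: int) -> list[str]:
    """Placeholder for a model call that drafts n candidate examples."""
    return [f"{seed_prompt} -- draft {i}" for i in range(n)]

def expert_review(candidate: str) -> tuple[bool, str]:
    """Placeholder for a human expert who approves and possibly edits a draft."""
    return True, candidate  # in practice: a review UI, a rubric, and edits

def build_dataset(seed_prompts: list[str], drafts_per_prompt: int = 5) -> list[str]:
    dataset = []
    for prompt in seed_prompts:
        for draft in generate_candidates(prompt, drafts_per_prompt):
            approved, final_text = expert_review(draft)
            if approved:
                dataset.append(final_text)  # only expert-verified data is kept
    return dataset

print(len(build_dataset(["Explain a discounted cash flow model step by step."])))
```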

Challenges in Structuring Massive Enterprise Data (14m37s)

  • The biggest data-governance challenge is the structure and cleanliness of enterprise data, which hinder efficient model ingestion (a minimal cleaning sketch follows this list).
  • Mining existing data provides a one-time benefit but is insufficient for sustained AI progress.
  • Forward data production, involving continuous collection and generation of new data, is crucial for long-term AI advancement.
  • Increasing the data supply can be achieved by collecting more longitudinal data and collaborating with human experts to produce frontier data that expands the models' capabilities.
  • To truly advance AI models, complex data requiring agentic behavior, complex reasoning chains, and advanced knowledge in various fields is necessary.
  • A global, infrastructure-level effort is needed to collect and generate the required data.
  • Data, not compute, is the bottleneck to foundation model performance.
  • Foundation models require massive amounts of data for training and improvement.
  • Collaboration between the world's experts and AI models is essential for producing the best AI systems.
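
A minimal sketch of the structuring and cleaning work described above: normalizing heterogeneous records and removing duplicates before model ingestion. The record layout and cleaning rules are illustrative assumptions, not a production data-governance pipeline.

```python
# Minimal sketch of turning messy enterprise records into a clean, deduplicated
# text corpus suitable for model ingestion. Fields and rules are assumptions.
import hashlib
import re

raw_records = [
    {"id": 1, "body": "  Q3 revenue grew 12%\n\n(see attached)  ", "source": "email"},
    {"id": 2, "body": "Q3 revenue grew 12% (see attached)", "source": "wiki"},   # near-duplicate
    {"id": 3, "body": None, "source": "crm"},                                    # missing text
]

def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so equivalent records hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()

def build_corpus(records: list[dict]) -> list[str]:
    seen_hashes: set[str] = set()
    corpus: list[str] = []
    for rec in records:
        if not rec.get("body"):            # drop empty / malformed records
            continue
        clean = normalize(rec["body"])
        digest = hashlib.sha256(clean.encode()).hexdigest()
        if digest in seen_hashes:          # exact-duplicate removal after normalization
            continue
        seen_hashes.add(digest)
        corpus.append(clean)
    return corpus

print(build_corpus(raw_records))  # -> one cleaned record instead of three raw ones
```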

Fair Access to Proprietary Data for Models (18m59s)

  • Data is one of the three pillars (along with algorithms and compute) that can create a sustainable competitive advantage for foundation models.
  • OpenAI's partnerships with organizations like the Financial Times (FT) and Axel Springer give it access to proprietary data that other models may not have, providing an advantage on certain queries.
  • In the future, labs will focus on differentiating themselves through the data they use and the unique rights they have to different data sources.
  • Companies will start building data strategies to drive more differentiation in the market.
  • Instead of bragging about the number of GPUs they have, researchers and CEOs will brag about the data they have access to and their unique rights to different data sources.

Model Commoditization (22m2s)

  • Data strategy is a potential competitive advantage for AI labs, enabling the creation of unique datasets and differentiated access to new data.
  • Enterprises are cautious about sharing sensitive data with external parties due to the risk of losing their competitive advantage, creating an opportunity for on-premise models that can be customized using their own data.
  • AI services are expected to generate more revenue than AI models themselves in the coming years, but the location of value capture in the AI stack is constantly shifting, making it challenging to predict where the most significant revenue opportunities will lie.
  • Alex Wang believes that data, not compute, is the bottleneck to foundation model performance.
  • Nvidia is currently the most valuable AI company, surpassing Meta, Google, Amazon, and Saudi Aramco.
  • Value exists in all aspects of the AI stack, including infrastructure, applications, and everything above and below the model.

Value Extraction Challenges in AI Commoditization (26m51s)

  • The commoditization of AI features will not necessarily lead to increased pricing or value extraction.
  • Software production and creation costs are decreasing, resulting in more customized software for enterprises.
  • Software engineering will change as models improve, with developers focusing on different tasks.
  • Alex Wang believes that the key to foundation model performance lies in data rather than compute.
  • The valuable part of what AI engineers do is translating customer problems into engineering problems that can be solved by AI.
  • Per-seat pricing is likely to end as more work is done by AI agents and models, leading to a shift towards consumption-based pricing (a simple comparison follows this list).
  • There is concern that overly restrictive data regulations, such as those seen in the EU, could stifle innovation in AI.
  • More permissive data access regulations are compatible with liberal democracies and necessary for the advancement of AI.
  • The US and UK need to adopt policies that ensure they are not disadvantaged in terms of data production for AI models.
  • There should be centralized and accessible data sets that do not give proprietary advantages to specific players.
  • Examples include safety data in aerospace and fraud and compliance data in financial services.
  • Restrictions in consumer-facing areas, such as HIPAA in healthcare, need to be reviewed to ensure they do not prevent AI progress.
  • Clear anonymization provisions or ways to use existing patient data to improve future health outcomes are needed.
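
As a simple illustration of the per-seat versus consumption-based pricing point above, here is a back-of-envelope comparison; all prices and volumes are made-up assumptions.

```python
# Illustrative comparison of per-seat vs. consumption-based pricing for an
# AI-assisted workflow. Every number below is an assumption for illustration.
seats            = 50
price_per_seat   = 30.00         # USD per user per month (assumed)
per_seat_monthly = seats * price_per_seat

agent_tasks         = 40_000     # tasks completed by AI agents per month (assumed)
price_per_task      = 0.05       # USD per completed task (assumed)
consumption_monthly = agent_tasks * price_per_task

print(f"Per-seat:     ${per_seat_monthly:,.2f}/month")     # $1,500.00
print(f"Consumption:  ${consumption_monthly:,.2f}/month")  # $2,000.00
# As agents take on more of the work, revenue tracks usage (tasks), not headcount.
```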

A Military Asset in Global Conflict: China & Russia (36m53s)

  • Thanks to its centralized industrial policy and aggressive approach, China has become a leader in industries like solar and EVs, and its rapid progress in AI is enabling it to catch up to the US.
  • AI technology has the potential to be a significant military asset, potentially even more impactful than nuclear weapons, and the geopolitical environment is increasingly tense with rising conflicts and totalitarian leaders.
  • The concern about AI's military potential necessitates proactive efforts from the Western world to prevent negative outcomes, raising the question of whether to adopt closed systems to mitigate risks, especially considering that open systems provide equal access to potential adversaries like Russia and China.
  • A dichotomy may be needed where advanced AI systems are closed for geopolitical and military reasons, while less advanced versions can be made open for economic value.
  • As of now, models like Llama 3 are not considered advanced enough to be a military asset, so it is acceptable for them to be open, but the line between which models should be open and which should be closed needs to be drawn carefully.

The Future Landscape of Foundation Models (42m49s)

  • Foundation models are incredibly expensive to train, costing hundreds of millions to billions of dollars (a rough back-of-envelope calculation follows this list).
  • In the future, AI efforts will coalesce around nations or large tech companies due to the high costs involved.
  • Only nations or hyperscalers will have the resources to subsidize or underwrite these massive AI programs.
  • The future AI landscape will likely be dominated by a few large players, with smaller players being acquired by large cloud providers like Google, Amazon, or Nvidia.
  • Partnerships between large tech companies and AI startups, such as the OpenAI-Microsoft partnership or the Anthropic-Amazon partnership, will be interesting to watch and could shape the future of the technology.
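
A rough back-of-envelope sketch of why frontier training runs reach this cost range, using the common approximation of about 6 × parameters × tokens FLOPs for a training run. Every number below is an assumption for illustration, not a figure from the episode.

```python
# Rough back-of-envelope for frontier-model training cost using the common
# ~6 * parameters * tokens FLOPs approximation. All inputs are assumptions.
params = 1e12         # 1 trillion parameters (assumed)
tokens = 15e12        # 15 trillion training tokens (assumed)
flops  = 6 * params * tokens                  # ~9e25 FLOPs

gpu_flops_per_s  = 4e14    # ~400 TFLOP/s sustained per GPU, utilization included (assumed)
gpu_hours        = flops / gpu_flops_per_s / 3600
cost_per_gpu_hour = 2.50   # assumed cloud price in USD

print(f"GPU-hours:    {gpu_hours:,.0f}")                        # ~62.5 million GPU-hours
print(f"Compute cost: ${gpu_hours * cost_per_gpu_hour:,.0f}")   # ~$156 million
```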

About Founder Brand & PR & Media (44m52s)

  • Traditional press prioritizes generating clicks over providing genuine education, leading founders to focus on direct channels and podcasts to control their narratives.
  • Building a personal brand is crucial as people follow personalities more than companies, a phenomenon known as the "cult of personality" driven by a deep human need to relate to individuals.
  • Founders should consider traditional PR less relevant and prioritize authentic communication.
  • The media's narrative around tech companies shifted during the 2022 industry downturn, with coverage focusing on pointing out missteps and criticizing companies like Scale AI and its peers.
  • Scale AI's collaboration with the US military and the Department of Defense, which began in 2020, was initially criticized by traditional media for supporting the government and military.
  • Alex Wang's contrasting treatment when testifying before Congress about military uses of AI highlighted the differing incentives at play: Congress focused on informed decision-making, while much of the media was driven by clickbait and sensationalism.

Hiring (52m11s)

  • Alex Wang stresses the importance of hiring people who are passionate about their work, the company, and making a positive impact. He personally approves every hire to maintain a high hiring standard and ensure the team's quality.
  • Wang sometimes goes against the hiring team's recommendations, relying on his experience and understanding of what makes people successful at Scale AI.
  • He emphasizes the need to identify whether people operate out of fear or freedom and adjust management approaches accordingly.
  • During the rapid team expansion in 2020-2021, Scale AI faced challenges in maintaining high standards and a sense of excellence.
  • The best hires are often those who would have joined regardless of a company's popularity.
  • A self-preserving talent ecosystem that maintains high standards and continuously seeks out the best individuals is more valuable than being the hottest company.
  • Go-to-market functions, such as sales, tend to gravitate towards more popular brands, while core technical development often remains concentrated in the original location, regardless of popularity.
  • Airbnb's Brian Chesky successfully rebuilt the company after the pandemic by shrinking the team, investing in talent density, and focusing on profitability per head, making it one of the most profitable companies in tech.

Quick-Fire Round (1h0m41s)

  • Alex Wang, the founder of Scale AI, emphasizes that data, not compute, is the key factor limiting the performance of foundation models in AI.
  • He believes that the common misconception about AI is the overemphasis on compute power, neglecting the crucial role of data in driving AI progress.
  • Wang expresses admiration for Satya Nadella's business strategies at Microsoft and would choose him as his next board member.
  • Reflecting on his journey since starting Scale AI in 2016, Wang acknowledges a significant evolution in his perspective on AI, having witnessed various eras of the technology.
  • He raises concerns about the potential challenges generative AI may face, drawing parallels to autonomous vehicles where promises outpaced technical capabilities, leading to setbacks.
  • Wang envisions the future of Scale AI in 10 years as continuing its role as a data foundry for AI and actively contributing to the advancement of AI technology.
  • While contemplating the possibility of going public, Wang questions the personal appeal of being a public company CEO compared to the fulfillment he derives from solving enduring problems.
