In today's episode of the Daily AI Show, Brian, Beth, Andy, Karl, and Jyunmi discussed whether frontier AI models can keep growing at a rate of 5x per year. The conversation was sparked by a report from EpochAI.org, which analyzed the training compute of frontier AI models and found a consistent growth rate of 4 to 5 times annually. The co-hosts explored the factors contributing to this growth, including algorithmic efficiencies and novel training methodologies.
Key Points Discussed:
Training Compute and Frontier Models:
- Definitions Clarified: The discussion began by defining key terms such as 'compute' (measured in FLOPs, or floating-point operations) and 'frontier models' (the top 10 models by training compute).
- Historical Context: Training compute has grown dramatically. It followed Moore's law in the pre-deep learning era (1956-2010), doubled every six months in the deep learning era (2010-2015), and has doubled every 10 months in the large-scale era (2015-present).
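The doubling times and the "x per year" figures quoted above describe the same kind of growth in different units. The sketch below converts between the two. The function names are illustrative and do not come from the report.

```python
import math

def annual_growth_factor(doubling_time_months: float) -> float:
    """Growth multiple per year implied by a doubling time in months."""
    return 2 ** (12 / doubling_time_months)

def doubling_time_months(annual_factor: float) -> float:
    """Doubling time in months implied by a growth multiple per year."""
    return 12 * math.log(2) / math.log(annual_factor)

# Doubling times mentioned in the episode, expressed as yearly multiples.
for label, months in [("deep learning era", 6), ("large-scale era", 10)]:
    factor = annual_growth_factor(months)
    print(f"{label}: doubling every {months} months = {factor:.2f}x per year")

# The 4x to 5x per year range, expressed as doubling times.
for factor in (4, 5):
    months = doubling_time_months(factor)
    print(f"{factor}x per year = doubling every {months:.1f} months")
```

By this arithmetic, doubling every six months is exactly 4x per year, while doubling every 10 months is about 2.3x per year. A 4 to 5x annual rate therefore implies a doubling time of roughly five to six months.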
Alternative Methods to Frontier Model Training:
- Evolutionary Model Merge: Combining existing models requires significantly less compute than training new models from scratch.
- Mixture of Experts and Depth: Techniques like mixture of experts, smaller model gangs, and mixture of depths optimize the training process.
- JEPA (Joint Embedding Predictive Architecture): This method predicts abstract representations, increasing efficiency by learning from less data.
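To make the mixture-of-experts idea concrete, here is a minimal sketch of top-k routing, in which only a few experts run for each input instead of the whole model. It is a toy illustration under stated assumptions, not the method of any specific model discussed in the episode: the experts are plain scalar functions, and the gate scores are hard-coded where a real system would use a learned gating network.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    # Renormalize the weights over the selected experts only.
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, gate_scores, k=2):
    """Weighted sum of the outputs of the selected experts; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, a=a: a * x for a in (1.0, 2.0, 3.0, 4.0)]

# Assumed gate scores; in practice a gating network computes these per input.
gate_scores = [0.1, 2.0, -1.0, 2.0]

print(route_top_k(gate_scores))
print(moe_forward(1.5, experts, gate_scores))
```

With four experts and k=2, only half of the expert computation runs for this input, which is the source of the compute savings.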
Algorithmic Efficiencies and Unhobbling:
- Improved Algorithms: The algorithms themselves have become more efficient, drastically reducing the inference cost.
- Unhobbling Techniques: Methods like chain-of-thought prompting, RLHF (reinforcement learning from human feedback), and scaffolding enhance the model's ability to solve complex problems step by step, rather than instantaneously.
Business Implications and Future Outlook:
- Business Adaptation: Companies should plan for continuous improvements in AI capabilities, focusing on building solutions that can evolve with the technology.
- Data and Environmental Considerations: As AI training approaches the limits of available data, synthetic data and curated datasets like FineWeb will become crucial. Sustainability and logistical challenges in compute and chip manufacturing also need to be addressed.
- Predicted Growth: Despite potential bottlenecks, the consensus is that AI models will continue to grow at a rapid pace, potentially surpassing human cognitive benchmarks within a few years.