In a fast-changing world where AI actively shapes the future, it’s crucial to stay informed. Thanks to the annual State of AI Report, authored by Nathan Benaich and the Air Street Capital team, we have a comprehensive overview of the latest AI breakthroughs, trends, and challenges. Here we highlight key takeaways from the report, providing a glimpse into the state of AI in 2023:
Research: Technology Breakthroughs and Their Capabilities
- GPT-4: OpenAI’s latest model, GPT-4, stands out as the most capable AI model to date, significantly outperforming GPT-3.5 and excelling at coding tasks.
- Autonomous Driving: LINGO-1 by Wayve adds a vision-language-action dimension to driving, potentially improving the transparency and reasoning of autonomous driving systems.
- Text-to-Video Generation: VideoLDM and MAGVIT lead the race in text-to-video generation, taking distinct approaches: diffusion and transformers, respectively.
- Image Generation: Assistants like InstructPix2Pix and Genmo AI’s “Chat” enable more controlled and intuitive image generation and editing through textual instructions.
- 3D Rendering: 3D Gaussian Splatting, a new contender in the NeRF space, delivers high-quality real-time rendering by blending the contributions of millions of Gaussian distributions (the first sketch after this list gives a toy 2D illustration).
- Small vs. Large Models: Microsoft’s research shows that small language models (SLMs), when trained on specialized datasets, can rival larger models. The TinyStories dataset represents an innovative approach in this direction: assisted by GPT-3.5 and GPT-4, researchers generated a synthetic dataset of very simple short stories that capture English grammar and general reasoning rules. Training SLMs on TinyStories revealed that GPT-4, used as an evaluator, preferred stories generated by a 28M-parameter SLM over those produced by GPT2-XL (1.5B).
- AI’s Growing Role in Medicine: Models like Med-PaLM 2 showcase AI’s increasing prominence in medicine, even surpassing human experts on specific tasks. Google’s Med-PaLM 2 achieved a new state-of-the-art result through LLM improvements, medical-domain finetuning, and prompting strategies. The integration of MultiMedBench, a multimodal dataset, enabled Med-PaLM to extend its capabilities beyond text-based medical Q&A, demonstrating its ability to adapt to new medical concepts and tasks. Moreover, the latest computer vision techniques are proving effective in disease diagnostics.
- RLHF: Reinforcement Learning from Human Feedback remains a dominant training method and played a significant role in enhancing LLM safety and performance, as exemplified by OpenAI’s ChatGPT. However, researchers are exploring alternatives that reduce the need for human supervision, addressing concerns about cost and potential bias. These include self-improving models that learn from their own outputs and approaches that replace RLHF with fine-tuning on carefully crafted prompts and responses.
- Watermarking: As AI’s content generation abilities advance, there’s a growing demand for watermarking or labeling AI-generated outputs. For instance, researchers at the University of Maryland are working on inserting subtle watermarks into text generated by language models (the second sketch after this list shows the core biasing step), and Google DeepMind’s SynthID embeds digital watermarks in image pixels to differentiate AI-generated images.
- Data Limitations: There’s concern over exhausting the stock of human-generated data, with projections suggesting potential shortages between 2030 and 2050. However, speech recognition and optical character recognition systems might expand data availability by unlocking audio and print archives.
- LLaMa-2: While commercial models dominate the field, ongoing efforts focus on producing high-performing models through open-source approaches, exemplified by Meta’s LLaMa series.
- Non-Disclosure: Increased economic stakes and safety concerns have led to a culture of opacity around cutting-edge research. OpenAI and Google have moved towards not disclosing detailed information about their top models, like GPT-4 and PaLM-2.
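To make the Gaussian splatting idea concrete, here is a toy 2D analogue, not the actual method: an image is built up by accumulating the weighted contributions of many Gaussians, each with its own position, scale, color, and opacity. The real technique optimizes millions of anisotropic 3D Gaussians against training photographs and alpha-blends them in depth order; everything below is an illustrative simplification.

```python
import numpy as np

# Toy 2D "splatting": render an image as a normalized weighted sum of
# isotropic Gaussians. Real 3D Gaussian Splatting projects anisotropic 3D
# Gaussians to screen space and alpha-blends them in depth order; this
# sketch only illustrates the "sum of Gaussian contributions" idea.

H, W, N = 64, 64, 200
rng = np.random.default_rng(0)

centers = rng.uniform(0, [H, W], size=(N, 2))  # splat positions (y, x)
scales  = rng.uniform(1.0, 4.0, size=N)        # per-splat standard deviation
colors  = rng.uniform(0, 1, size=(N, 3))       # per-splat RGB
opacity = rng.uniform(0.2, 0.8, size=N)        # per-splat weight

ys, xs = np.mgrid[0:H, 0:W]
image  = np.zeros((H, W, 3))
weight = np.zeros((H, W))

for (cy, cx), s, col, a in zip(centers, scales, colors, opacity):
    # Gaussian falloff of this splat's influence at every pixel
    g = a * np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * s ** 2))
    image  += g[..., None] * col
    weight += g

image /= np.maximum(weight, 1e-8)[..., None]   # normalize accumulated color
```

Because each Gaussian is a simple rasterizable primitive rather than a neural network queried per ray (as in NeRF), the real method can render at interactive frame rates.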
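The Maryland text-watermarking scheme works, roughly, by pseudo-randomly splitting the vocabulary into a “greenlist” and a “redlist” at each generation step, seeded by the previous token, and nudging the model toward greenlist tokens. A detector that knows the seeding scheme can then test whether a text contains suspiciously many greenlist tokens. Below is a minimal sketch of the biasing step; the vocabulary size and constants are placeholder assumptions, not the authors’ code.

```python
import hashlib
import numpy as np

VOCAB_SIZE = 50_000  # placeholder vocabulary size
GAMMA = 0.5          # fraction of the vocabulary on the greenlist
DELTA = 2.0          # logit boost applied to greenlist tokens

def greenlist(prev_token: int) -> np.ndarray:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token."""
    seed = int.from_bytes(hashlib.sha256(str(prev_token).encode()).digest()[:8], "big")
    perm = np.random.default_rng(seed).permutation(VOCAB_SIZE)
    return perm[: int(GAMMA * VOCAB_SIZE)]

def watermarked_sample(logits: np.ndarray, prev_token: int) -> int:
    """Boost greenlist logits by DELTA, then sample from the softmax as usual."""
    biased = logits.copy()
    biased[greenlist(prev_token)] += DELTA
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))
```

Human-written text lands on the greenlist at roughly the expected 50% rate, while watermarked text exceeds it by a statistically detectable margin, which is what the detector tests for.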
Industry: Commercial Applications and the Business Impact of AI
- NVIDIA’s Dominance: NVIDIA achieved a record Q2 ‘23 data center revenue of $10.32B and entered the $1T market cap club.
- GenAI Dominance: The rise of GenAI is the year’s most prominent trend, and it played a crucial role in stabilizing AI investment in 2023; without GenAI, AI funding would have declined significantly.
- Top Sectors Benefitting from AI: Enterprise Software, Fintech, Healthcare.
- Public Market Dynamics: Public valuations are showing signs of recovery. AI-integrated giants such as Apple, Microsoft, NVIDIA, Alphabet, Meta, Tesla, and Amazon play a crucial role in boosting the S&P 500.
- Private Market Trends: The US holds a dominant position in the global private AI sector, with 70% of capital investments in 2023. In contrast, European AI businesses faced a sharp decline in capital support.
- Major Mergers and Acquisitions: The M&A market remains active, with significant acquisitions like MosaicML + Databricks ($1.3B), Casetext + Thomson Reuters ($650M), and InstaDeep + BioNTech (€500M).
- Corporate Investment Dynamics: 24% of all corporate venture capital investments in 2023 were directed into AI companies.
- Funding Dynamics: GenAI companies dominate mega funding rounds, which are often raised to acquire cloud computing capacity for training large-scale AI systems. In 2023, GenAI companies also received notably larger seed and Series A rounds than other startups.
Politics: Regulation of AI, Economic Implications, and the Evolving Geopolitics of AI
- UK and India’s Light-Touch Regulation: The UK and India embrace a pro-innovation approach, investing in model safety and securing early access to advanced AI models.
- EU and China’s Stringent Legislation: The EU and China have moved towards AI-specific legislation with stringent measures, especially regarding foundation models.
- US and Hybrid Models: The US has not passed a federal AI law, with individual states enacting their own regulations. Critics view these laws as either too restrictive or too lenient.
- Regulation and Transparency: The upcoming 2024 US presidential election raises concerns about AI’s role in politics, prompting the US Federal Election Commission to call for public comment on AI regulations in political advertising. Google’s policy on disclaimers for AI-generated election ads is an example of transparency efforts.
- AI and Bias: AI bias accusations, particularly from US conservative groups, suggest that cultural conflicts are spilling over into the AI sphere. OpenAI is addressing these issues through moderation and user fine-tuning.
- Evolving Geopolitics of AI: The semiconductor industry, essential for advanced AI computation, has become a focal point in US-China geopolitical tensions, with broader implications for global AI capabilities.
- Job Market Impact: Research suggests AI advancements may result in substantial job losses in professions like law, medicine, and finance. However, AI could also potentially democratize expertise and level the playing field in skill-based jobs.
Safety: Identifying and Mitigating Catastrophic Risks Posed by Highly Capable Future AI Systems
- Call to Address Safety: Concerns over highly capable AI systems have prompted an open letter from the Future of Life Institute, calling for a pause on AI development more powerful than GPT-4 to address safety. However, there’s no consensus on specific risks or the time horizon over which they might become relevant.
- Mitigation Efforts: AI labs are implementing their own mitigation strategies, including toolkits to evaluate dangerous capabilities and responsible scaling policies with safety commitments. Moreover, API-based models, such as those from OpenAI, have the infrastructure to detect and respond to misuse in adherence to usage policies.
- Open vs. Closed Source AI: The debate continues on whether open-source or closed-source AI models are safer. Open-source models promote research but risk misuse, while closed-source APIs offer more control but lack transparency.
- Pretraining Language Models with Human Preferences: Instead of the traditional three-phase training, researchers suggest incorporating human feedback directly into the pretraining of LLMs. This approach, demonstrated on smaller models and adopted in part by Google for PaLM-2, has been shown to reduce harmful content generation (the first sketch after this list illustrates the idea).
- Constitutional AI and Self-Alignment: A new approach relies on a set of guiding principles and minimal human feedback. Models generate their own critiques and revisions, which are then used for further finetuning (the second sketch after this list outlines the loop). This could prove a better solution than RLHF, as it avoids reward hacking by making models explicitly adhere to set constraints.
- Jailbreaking and Model Safety: Addressing issues related to crafting prompts that bypass safety protocols remains a challenge.
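One concrete way to fold human preferences directly into pretraining is conditional training: each pretraining document is prefixed with a control token reflecting a reward model’s judgment of it, and at inference the model is conditioned on the “good” token. The sketch below illustrates that scheme; `reward_model`, `model`, and `tokenizer` are hypothetical stand-ins, not any published implementation.

```python
# Conditional-training sketch: tag pretraining text with preference tokens
# so the LM learns the distinction, then condition on GOOD at inference.
# `reward_model`, `model`, and `tokenizer` are hypothetical stand-ins.

GOOD, BAD = "<|good|>", "<|bad|>"
THRESHOLD = 0.0  # reward score above which a document counts as preferred

def tag_document(text: str, reward_model) -> str:
    """Prefix a pretraining document with a preference control token."""
    score = reward_model(text)               # scalar human-preference score
    prefix = GOOD if score >= THRESHOLD else BAD
    return prefix + text                     # the LM is pretrained on tagged text

def generate_preferred(model, tokenizer, prompt: str) -> str:
    """At inference, condition on GOOD to steer toward preferred behavior."""
    return model.generate(tokenizer(GOOD + prompt))
```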
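And here is the critique-and-revision loop behind Constitutional AI in outline. The `model` callable is a hypothetical prompt-to-completion interface and the principles are illustrative; the full method also adds an RL-from-AI-feedback stage that this sketch omits.

```python
# Constitutional AI sketch: the model critiques and revises its own answers
# against written principles; the revised answers become finetuning data.
# `model` is a hypothetical prompt -> completion callable.

PRINCIPLES = [
    "Choose the response least likely to assist with harmful activity.",
    "Choose the response that is most honest and least misleading.",
]

def self_revise(model, prompt: str) -> str:
    """Generate, then iteratively critique and rewrite against each principle."""
    response = model(prompt)
    for principle in PRINCIPLES:
        critique = model(
            f"Principle: {principle}\n"
            f"Prompt: {prompt}\nResponse: {response}\n"
            "Critique the response according to the principle."
        )
        response = model(
            f"Original response: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    # The (prompt, final response) pairs feed supervised finetuning.
    return response
```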
Predictions: What Could Happen in the Next 12 Months?
- A Hollywood-grade production makes use of GenAI for visual effects.
- A GenAI media company is investigated for its misuse during the 2024 US election cycle.
- Self-improving AI agents crush SOTA in a complex environment.
- Tech IPO markets thaw, and we see at least one major listing for an AI-focused company.
- The GenAI scaling craze sees a group spend >$1B to train a single large-scale model.
- The US’s FTC or the UK’s CMA investigates the Microsoft/OpenAI deal on competition grounds.
- We see limited progress on global AI governance beyond high-level voluntary commitments.
- Financial institutions launch GPU debt funds to replace VC equity dollars for compute funding.
- An AI-generated song breaks into the Billboard Hot 100 Top 10 or the Spotify Top Hits 2024.
- As inference workloads and costs grow significantly, a large AI company acquires an inference-focused AI chip company.
Amid the rapid pace of AI development, the “State of AI Report 2023” stands out by underscoring a crucial point: AI development never happens in isolation; it is always embedded in broader socioeconomic forces. Thanks again to the authors for providing such a comprehensive and forward-looking perspective on the subject.