Top AI News, February 2024

In this monthly recap, we highlight the top AI news stories from February:

AI Regulation

EU countries have unanimously approved the world’s first comprehensive rulebook for AI, marking a significant milestone in AI regulation. A compromise was established with a tiered approach, introducing horizontal transparency rules for all AI models and additional obligations for powerful models deemed to pose systemic risks. The approval reflects a balance between ensuring competitiveness and transparency while avoiding overburdening companies. 

The Biden-Harris administration has launched the AI Safety Institute Consortium (AISIC) under the U.S. Department of Commerce's National Institute of Standards and Technology to address the safety and trustworthiness of AI development. Created in response to President Biden's AI executive order, the consortium comprises over 200 participants, including major tech players such as OpenAI, Google, Microsoft, Apple, Amazon, Meta, NVIDIA, Adobe, and Salesforce. Academic institutions like MIT, Stanford, and Cornell, as well as industry researchers and think tanks, are also part of the consortium. AISIC's mandate includes developing guidelines for red-teaming (a process of challenging systems to identify vulnerabilities), capability evaluations, risk management, safety, security, and watermarking synthetic content.

Why it matters: These developments mark pivotal legislative advances, contributing to the ongoing evolution of the global regulatory framework for AI governance.

OpenAI Implements C2PA Watermarks

OpenAI is integrating watermarks from the Coalition for Content Provenance and Authenticity (C2PA) into its DALL-E 3 image generator. This move aligns with the broader initiative to establish the provenance of AI-generated content, championed by the C2PA, a Joint Development Foundation project formed by Adobe, Arm, Intel, Microsoft, and Truepic. The CR symbol, displayed as a pin in the top-left corner of images, along with invisible metadata, enables users to verify the use of AI in the content creation process. This verification can be done through tools like Content Credentials Verify, an open-source application supported by the C2PA standard and developed by a cross-industry community.

While the watermarking process has negligible effects on latency and image quality, OpenAI acknowledges that social media platforms may unintentionally or intentionally remove the metadata, posing challenges to its effectiveness.

Between the lines: This step highlights the industry's growing commitment to transparency and aligns with a trend we anticipated: content provenance standards increasingly shaping the copyright and legal framework around AI-generated work.

Plagiarism in GPT-3.5 Outputs

A new report from plagiarism detector Copyleaks reveals that 60% of outputs from OpenAI’s GPT-3.5 model, which powered ChatGPT in its debut, contained some form of plagiarism. This finding raises concerns among content creators, including authors and songwriters, who argue that generative AI trained on copyrighted material might produce exact copies, leading to legal disputes. Copyleaks uses a proprietary scoring method to assess plagiarism, considering factors like identical text, minor changes, and paraphrased content.

In response to the report, OpenAI spokesperson Lindsey Held clarified that the models were designed and trained to learn concepts that help them solve new problems, and that measures are in place to limit inadvertent memorization.

Between the lines: Commenting on the report, Lindsey Held also highlighted that OpenAI's terms of use prohibit the intentional use of the models to regurgitate content. This statement gains even more significance in the context of The New York Times' copyright infringement lawsuit, where OpenAI is seeking dismissal, alleging that the newspaper "hacked" its ChatGPT chatbot and AI systems to create misleading evidence. OpenAI claims that the Times manipulated its technology to reproduce material through deceptive prompts that violate OpenAI's terms of use.


OpenAI Introduces Sora

OpenAI has introduced Sora, a text-to-video model capable of generating videos up to a minute long from textual prompts. It can create complex scenes with multiple characters, specific motions, and fine detail. While the current model may struggle with simulating complex physics or understanding specific instances of cause and effect, it shows promise in generating compelling characters and vivid emotions.

The context behind: OpenAI is making Sora available to red teamers for risk assessment and seeks feedback from visual artists, designers, and filmmakers to enhance its capabilities. 


Google Rebrands Its AI as Gemini

Google has rebranded and consolidated its AI initiatives under the name "Gemini," encompassing the Bard chatbot and the Duet AI features from Google Workspace. Alongside this, Google introduced Gemini Advanced, powered by its most capable model, Gemini 1.0 Ultra, and rolled out a new mobile app for Android and iOS.

Why it matters: The move underscores how central AI technology has become to Google's overall strategy and future growth.

Microsoft in India

Microsoft's CEO, Satya Nadella, asserted the company's leading position in AI during an event in Mumbai, claiming that even a year after GPT-4's introduction, Microsoft possesses the best model. Nadella encouraged Indian businesses to embrace AI to enhance productivity and improve products. Emphasizing India's growing importance in the AI talent pool, he announced Microsoft's plan to provide AI training opportunities to two million people in smaller Indian cities and towns by next year.

Between the lines: One unique offering Nadella highlighted, and one we have already covered, is Karya, an "ethical data company" focused on creating datasets in multiple Indian languages to train AI models while fostering employment and education in rural India. The move reflects Microsoft's commitment to AI development and education in India, positioning the nation as a significant player in the global AI landscape.

Mistral AI’s Open-Source LLM miqu-1-70b

Mistral CEO Arthur Mensch confirmed the emergence of a new large language model (LLM) named "miqu-1-70b." Initially leaked by an over-enthusiastic employee, the model caused a stir due to its potential to approach or even exceed the performance of OpenAI's GPT-4. The leaked files, posted on Hugging Face and 4chan, garnered attention for exceptional performance on LLM tasks, challenging GPT-4's dominance. Mistral's confirmation suggests active development and hints at a shift in the competitive landscape, potentially challenging established players in the field.

Why it matters: This development could mark a pivotal moment for open-source GenAI, posing significant competition for proprietary models like GPT-4. The race to innovation in the open-source AI community is accelerating, questioning whether OpenAI can maintain its lead in the LLM domain.

Groq’s LPUs

Groq, an AI chip company, is making waves with its Language Processing Units (LPUs), claiming to be the world’s fastest for running large language models. The LPUs, which outpace Nvidia’s GPUs, serve as an “inference engine” to accelerate AI chatbots like ChatGPT and Gemini. Demonstrations show Groq producing hundreds of words in a factual answer within a split second, outperforming current chatbots and potentially overcoming delays in real-time human speech. In a third-party test, Groq’s LPUs achieved 247 tokens/second compared to Microsoft’s 18 tokens/second, suggesting that models like ChatGPT could run more than 13 times faster using Groq’s chips. 
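Taking the cited third-party figures at face value (the test setup and baseline model are not detailed here), the claimed speedup is simple arithmetic:

```python
# Throughput figures reported in the third-party test cited above.
groq_tokens_per_sec = 247   # Groq LPU
azure_tokens_per_sec = 18   # Microsoft-hosted baseline

speedup = groq_tokens_per_sec / azure_tokens_per_sec
print(f"{speedup:.1f}x")  # 13.7x, i.e. "more than 13 times faster"
```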

Why it matters: The company’s AI chips could revolutionize real-time communication with chatbots, addressing current limitations and opening new possibilities for practical applications.

Releases in AI Models

Our colleague Alexander Shironosov, Team Leader at Everypixel, has deepened our exploration of recent releases in AI models:

Stable Cascade by Stability AI: a text-conditional model trained in a highly compressed latent space, which enables faster inference and cost-effective training. Stable Cascade achieves an impressive compression factor of 42, encoding a 1024×1024 image to a mere 24×24 while maintaining sharp reconstructions. The model also supports extensions such as finetuning, LoRA, ControlNet, IP-Adapter, and LCM, making it versatile for different applications.

SDXL-Lightning: a text-to-image generation model distilled from stabilityai/stable-diffusion-xl-base-1.0. It achieves high-quality 1024px image generation in just 1, 2, 4, or 8 steps. Open-sourced as part of the research, it offers full UNet and LoRA checkpoints for the 1-step, 2-step, 4-step, and 8-step distilled models.

Playground 2.5: Playground has unveiled version 2.5 of its open-source text-to-image model. The model addresses challenges such as the muted color and contrast inherent in latent diffusion models, achieving notable improvements. Prioritizing alignment with human preferences, it enhances details such as facial clarity and hair texture. User studies also showed that Playground v2.5 notably outperforms SDXL and PixArt-α, showcasing a remarkable 4.8x improvement over SDXL.

Vision-arena: developed as a research project by the WildVision team at the Allen Institute for AI (AI2), UCSB, and UWaterloo. Vision-arena was inspired by the successful Chatbot Arena, a benchmark platform for LLMs that enables anonymous, randomized battles and ranks models through the Elo rating system, and which has already collected over 200,000 human preference votes. The WildVision team applied a similar approach to Vision Language Models (VLMs) and crafted a web demo on Hugging Face, allowing users to conduct side-by-side assessments of different VLMs, including GPT-4V, Gemini-Pro-Vision, and LLaVA.


Google’s Gemini AI Tool Faces Criticism

Google apologized for inaccuracies in historical image generation by its Gemini AI tool. The controversy emerged when users criticized Gemini for depicting historically white figures and groups, such as Nazi-era German soldiers, as people of color. Google stated that it is actively working to address these inaccuracies.
