OpenAI releases 2 new reasoning models
Plus, OpenAI in talks to acquire Windsurf (formerly Codeium) for ~$3Bn
Today’s Highlights:
📰 News: OpenAI releases 2 new reasoning models and an updated Preparedness Framework + updates from Anthropic, Google, and more
💰 Funding: OpenAI in advanced talks to acquire Windsurf (formerly Codeium) for ~$3Bn
⚡️ Top News Stories:
1. OpenAI has introduced two new advanced reasoning models, o3 and o4-mini, designed to handle complex, multi-step tasks by integrating various tools and modalities.
The o3 model is OpenAI's most powerful reasoning model to date, excelling in coding, mathematics, science, and visual perception. It sets new top scores on benchmarks such as Codeforces and SWE-bench, making it well suited to complex queries requiring multifaceted analysis.
The o4-mini model offers high performance with greater speed and efficiency relative to its size and cost. It achieves remarkable results in tasks involving mathematics, coding, and visual inputs, making it suitable for users needing quick and cost-effective solutions.
Both models can process and reason with images, integrating visual inputs directly into their thought processes. This capability allows them to analyze sketches, whiteboards, and other visual data, enhancing their problem-solving abilities.
OpenAI also launched Flex processing, a cost-cutting API option for its o3 and o4-mini reasoning models that offers slower, less reliable access at 50% lower pricing for low-priority AI workloads such as evaluations and data enrichment.
2. OpenAI has introduced a revamped Preparedness Framework that assesses AI model risks across areas like cybersecurity and persuasion, with automatic deployment blocks if critical risk thresholds are exceeded.
It introduced a structured system to assess and respond to risks across four key domains: cybersecurity, persuasion, model autonomy, and CBRN (chemical, biological, radiological, and nuclear threats).
The updated framework uses "Capability Tiers" (e.g., current model capabilities vs. future more dangerous ones) and corresponding risk scores to track how models evolve over time and whether they could pose unacceptable risks.
A new “Preparedness” team will evaluate if a model crosses a "critical risk threshold" — a red line where deployment is automatically blocked until appropriate safety mitigations are in place.
The system includes regular audits, involvement from external experts, and reviews by OpenAI’s Safety Advisory Group and the board of directors to prevent risky model deployments.
The updated Preparedness Framework allows OpenAI to adjust AI safety standards if competitors release high-risk models without similar safeguards, underscoring the growing tension between responsible development and market pressure.
3. Anthropic has added a “Research” tool and Google Workspace integration to Claude, letting the AI autonomously search the web and internal documents and deliver comprehensive answers with citations and actionable insights across emails, calendars, and docs.
The Research tool operates agentically, conducting multi-step web queries and reasoning through open-ended questions to produce high-quality, trustworthy responses, transforming how users handle tasks like competitive intelligence, academic learning, or client prep.
It integrates directly with Gmail, Calendar, and Google Docs, streamlining workflows such as pulling meeting notes, surfacing action items from emails, or retrieving context from work documents, without requiring users to upload files manually.
4. Google has made its Veo 2 text-to-video AI model available to Gemini Advanced subscribers, enabling them to create short, high-resolution, lifelike clips from prompts, along with the launch of Whisk Animate to turn still images into video.
5. OpenAI is testing a prototype social media platform built around ChatGPT’s image tools and a feed interface, potentially as a new app or feature inside ChatGPT.
6. Anthropic is preparing to launch a limited-release voice assistant feature for its Claude chatbot, introducing three distinct voices to compete with OpenAI’s ChatGPT voice mode while emphasizing the company’s safety-conscious approach.
7. Anthropic is doubling down on its partnership with AWS by forming a dedicated team to scale adoption of its AI among Amazon’s cloud customers, signaling a deeper strategic alignment that could help it reach its projected $12Bn revenue goal for 2027.
8. ByteDance has unveiled Seaweed, a hyper-efficient 7B-parameter AI video generation model that rivals and in some cases outperforms much larger models like OpenAI's Sora, Google’s Veo, and Wan 2.1 in human evaluations and multimodal tasks—while using significantly fewer compute resources.
9. Cohere launches Embed v4, a new generation of multilingual embedding models that outperform OpenAI, Google, and Mistral on key benchmarks like MTEB, offering top-tier accuracy in retrieval, classification, reranking, and clustering tasks.
10. Microsoft has introduced a groundbreaking "computer use" feature in Copilot Studio that allows AI agents to automate desktop and web UI interactions in real time, transforming RPA by enabling tasks like data entry, market research, and invoice processing without needing APIs.
11. xAI has introduced Grok Studio, a collaborative environment for writing and coding with live chatbot support, alongside a new memory feature that enables Grok to personalize responses by recalling details from past conversations.
12. Wikipedia has released an ML-optimized dataset on Kaggle, featuring structured English and French content such as summaries, infoboxes, and image links, as a cleaner, bandwidth-efficient alternative to scraping, aiming to support AI development while easing the strain caused by bots.
13. LM Arena has launched Search Arena, a crowdsourced evaluation platform that ranks search-augmented LLMs based on human preferences in real-world, dynamic tasks like current events, offering a more practical alternative to static benchmarks.
14. Kling AI’s launch of KLING 2.0 Master and KOLORS 2.0 marks a major leap in Chinese generative video and image models, delivering fluid, cinematic sequences and hyper-realistic visuals with advanced editing tools and prompt control.
15. Google is offering U.S. college students free access to its One AI Premium plan, including advanced Gemini tools and 2TB of storage, through June 2026 to capture the higher education market amid rising competition from OpenAI and Anthropic.
16. Despite promising rigorous AI safety transparency, Google released a delayed and minimal safety report for Gemini 2.5 Pro that omits critical risk assessments, sparking concerns from experts about declining safety standards across AI labs.
17. Meta will begin training its AI models on public posts and AI interactions from adult EU users to improve cultural and linguistic relevance, while offering opt-outs and excluding private messages and minors’ data, in what it claims is a transparent, regulator-approved approach.
18. NVIDIA announced that for the first time it will manufacture its AI supercomputers entirely in the U.S., starting with Blackwell chip production in Arizona and upcoming supercomputer assembly plants in Texas, marking a $500Bn push to localize and secure the AI supply chain.
19. AMD warns of an $800M charge tied to new U.S. export restrictions on AI chips, revealing that its MI308 GPUs now require a license for shipment to China and other countries, with no assurance of license approval.
20. NVIDIA will now require a U.S. export license to sell its H20 AI chips to China, a move that could cost the company $5.5Bn in Q1 as the government cites national security concerns over their use in Chinese supercomputers.
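For readers weighing the Flex processing option from story 1, here is a rough sketch of the arithmetic behind the "50% reduced pricing" figure. The per-token rates are placeholder values chosen to keep the math simple, not OpenAI's actual prices, and the `Pricing` and `job_cost` names are hypothetical helpers for illustration only.

```python
from dataclasses import dataclass

# From the story: Flex processing is priced at 50% below the standard rate.
FLEX_DISCOUNT = 0.5


@dataclass
class Pricing:
    """Per-million-token prices in USD (placeholder values, not real rates)."""
    input_per_million: float
    output_per_million: float


def job_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
             flex: bool = False) -> float:
    """Estimate the cost of a batch job, optionally at the Flex discount."""
    cost = (input_tokens / 1_000_000) * pricing.input_per_million
    cost += (output_tokens / 1_000_000) * pricing.output_per_million
    return cost * (1 - FLEX_DISCOUNT) if flex else cost


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
example = Pricing(input_per_million=10.0, output_per_million=40.0)

# An evaluation run with 2M input tokens and 500K output tokens.
standard = job_cost(example, 2_000_000, 500_000)
flex = job_cost(example, 2_000_000, 500_000, flex=True)
print(f"standard: ${standard:.2f}, flex: ${flex:.2f}")
```

The trade-off is the one described in the story: the discount applies only if the workload can tolerate slower and less reliable responses, which is why it suits background jobs like evaluations rather than user-facing requests.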
💰 Top Funding News:
1. OpenAI is in advanced talks to acquire AI coding assistant startup Windsurf (formerly Codeium) for around $3Bn, in what would be its largest-ever acquisition and a strategic move to compete more aggressively in the fast-growing AI developer tools market.
2. Hugging Face has acquired open-source robotics company Pollen Robotics for an undisclosed amount, bringing its humanoid robot Reachy and full team on board to expand Hugging Face’s mission of democratizing AI into the realm of open, community-driven embodied AI systems.
3. Deck, which uses AI to extract structured data from websites without APIs through user-consented browser automation, raised a $12M Series A led by Infinity Ventures, with participation from Intact Ventures, Better Tomorrow Ventures, Golden Ventures, and Luge Capital.
4. Capsule, which uses AI to power a co-producer video editing assistant for marketing and media teams, raised a $12M Series A led by Innovation Endeavors, with participation from HubSpot Ventures, Bloomberg Beta, Human Ventures, Swift Ventures, and angel investors.
That's all for today's email! If you want more please follow us at the social channels linked below, or check out our website!