Tag

#AI safety

146 articles

Elon Musk praises Mythos/Fable, promises not to ‘cut off’ Anthropic

Elon Musk praises Anthropic's Mythos/Fable framework and assures continued support for the company's AI development. With $40 billion in potential revenue at stake, Musk's endorsement carries significant industry weight.

Jul 9

tech

NHTSA demands autonomous vehicle companies fix first responder interference by end of July

The National Highway Traffic Safety Administration has given autonomous vehicle companies until the end of July to fix problems with their vehicles interfering with first responders.

Jul 8

AI’s hacking skills are outgrowing the tests built to measure them

AI models are outpacing existing cybersecurity benchmarks, leaving regulators and security teams unable to fully assess their potential dangers. The issue is especially urgent as federal agencies race to implement new safety protocols.

Jul 8

OpenAI’s Chief Futurist Is Leaving the Company

Joshua Achiam, OpenAI's chief futurist and AI safety researcher, is leaving the company after nearly nine years. His departure comes after a high-profile role in the Musk v. Altman trial, where he testified on AI governance.

Jul 7

Savi’s app aims to protect consumers from realistic AI scams like kidnappers demanding ransom

Savi launches a mobile app to protect consumers from realistic AI scams, including deepfake kidnapper demands. The company raised $7 million in seed funding for its AI safety initiative.

Jul 7

The emails that broke Anthropic and the Pentagon apart

Leaked emails reveal a major conflict between Anthropic and the Pentagon over the use of AI in military applications, highlighting deeper tensions about AI safety and governance.

Jul 3

You Can Now Sound the Alarm on AI Behaving Badly

A new website allows users to report AI systems behaving dangerously or inappropriately, addressing growing concerns about AI safety and accountability.

Jul 1

Anthropic Redeploys Claude Fable 5 on July 1 After US Export Controls Lift, Adds New Cybersecurity Classifier

This article explains the advanced cybersecurity classifier used by Anthropic to detect and prevent jailbreak attempts in AI systems, and how it relates to regulatory compliance and industry collaboration.

Jul 1

Anthropic's Fable 5 is back worldwide after a two-week government ban over a jailbreak

Anthropic's Fable 5 is back worldwide after a two-week government ban over a jailbreak exploit. The model's new safety classifier blocks the vulnerability in over 99% of cases, though it also flags more benign requests.

Jun 30

New attack provides one more reason why AI browsers are a bad idea

Researchers have demonstrated that instructing an LLM to believe that 2 + 2 = 5 is enough to manipulate it into following forbidden instructions, raising serious concerns about AI browser security.

Jun 30

Tesla starts testing its production Cybercab without steering wheel or pedals in Austin

This explainer explores the advanced AI concepts behind Tesla's fully autonomous Cybercab, including sensor fusion, neural networks, and real-time decision-making systems.

Jun 30

Meta is telling engineers to handle Claude Code and Codex with care

Learn how to work with Claude Code and Codex AI assistants while implementing safety measures to prevent unintended distillation of proprietary data.

Jun 30