Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable

Cyber TechCrunch ✦ xCruzoAi 🇺🇸🇪🇸
— Ai Summary —

Anthropic's Fable, released as a public, limited version of Mythos, is drawing criticism from cybersecurity researchers who say its guardrails hinder practical work. When a prompt triggers safety measures on cybersecurity or biology topics, Fable may pause and cite the restriction or fall back to Claude Opus 4.8, behavior that highlights the keyword-based nature of the guardrails. The restrictions are intended to limit malware development and other risks associated with cyber and biological threats, the same concerns that led Anthropic to initially roll back Mythos to a restricted set in Project Glasswing. Researchers describe the guardrails as inconsistent and overly cautious, noting that even a request for a code review can trigger restrictions and hamper routine software engineering. Some experts expect the guardrails to evolve as Anthropic and its peers collaborate with the security community. Anthropic did not respond to requests for comment, but the company points to its Cyber Verification Program, which gives approved cybersecurity professionals fewer limitations on Claude for their work, an approach that echoes OpenAI's Trusted Access for Cyber.

AI-generated summary • Source: TechCrunch • Read the full article for complete information.