Science

AI That Finds and Fixes Cyber Weaknesses — But Could Also Teach Hackers

2 min read · 2026-05-11

A powerful AI model built by Anthropic can scan computer systems and find security gaps that human experts might miss — but the same tool could also show anyone, even beginners, exactly how to break in.

1 AI model capable of both finding and exploiting cybersecurity vulnerabilities, per Anthropic's Claude testing

The facts

  • Anthropic's AI model, called Claude, has been tested on a capability called "Mythos" that can identify vulnerabilities (weak points in software or networks) that trained human security analysts sometimes overlook.
  • Finding a vulnerability is only half the problem: Claude can also generate step-by-step instructions for how to exploit those weak points, meaning it can describe an attack method in detail.
  • This creates a double-edged sword, a tool that is useful and dangerous at the same time, because the same output that helps a defender patch a system could help an attacker break into one.
  • The bigger concern raised by security experts is that non-experts, people with little or no technical training, could use such an AI to launch attacks that previously required deep specialist knowledge.
  • Anthropic says it builds safety layers into Claude to limit misuse, but researchers note that no AI guardrail is perfect, and the gap between legitimate security research and harmful exploitation is narrow.

Why it matters

As more of everyday life — bank accounts, school records, hospital data — moves online, the tools used to protect and attack those systems are becoming more powerful at the same time. Understanding this tradeoff helps citizens ask the right questions about who should control such AI and how.

Sources

  • Anthropic
  • The Hindu
