From a Copilot to Cluster Admin: inside AtlasOps, our free agent-security CTF

Ariel Fogel and Eilon Cohen

June 26, 2026

Over the past year, our team kept encountering the same failure with production AI agents. A low-privilege foothold that should have gone nowhere would compound through one porous boundary after another until it reached the keys to everything. The starting point was almost always an agent with real reach: a coding assistant, a CLI tool, an automation platform sitting in the middle of someone's workflow.

AtlasOps is that pattern turned into a game. It's a free, browser-based CTF where a chat box becomes cluster-admin, one real-world move at a time. A player signs in with no privileges beyond the ability to talk to an AI copilot named Atlas, and by the end holds cluster-admin credentials for the entire Kubernetes estate.

This post covers what AtlasOps CTF is, how we built it to teach rather than just test, and why the killchain runs from a chat box to cluster admin.

AtlasOps is a free, browser-based capture-the-flag where the target is an AI agent. It opens inside an internal ops console: a topology graph, a service catalog, a handful of live incidents, and an AI copilot named Atlas waiting to help.

It exists to show how a real agent compromise unfolds, start to finish. It runs in the browser, so there's nothing to install or configure.

What it is

AtlasOps drops a player into a working DevOps/SRE control plane for a cloud company. Services depend on other services, incidents point to real components, and automated jobs run in the background. Atlas sits in the middle of it, an AI ops copilot with real tools and real reach into that environment.

In AtlasOps, the agent itself is the attack surface, which is what sets it apart from a chatbot with a hidden password. Instead of guessing a system prompt or wordsmithing past a safety filter, the player learns to read what the agent can do, its tools, its sandbox, the surfaces it can touch, and then turns those capabilities against the environment they live in. That's what an attacker does to a production copilot, and it's what AtlasOps puts in front of its players.

What a player walks away with is a map of how trust moves through an agentic system, and where it leaks.

How we built it to teach

Most CTFs are level-based: a player drops into a tightly scoped puzzle, captures the flag, and moves on to the next isolated challenge. OverTheWire, a wargame that has inspired a generation of hackers to think like attackers, works differently: it gives players access to an environment that mimics the kinds of systems real attackers target. Nobody learns Linux privilege escalation from a tutorial there; a player logs into a machine that behaves like a real one, gets stuck, and the friction is where the learning happens. Skills earned against a system that acts like the real thing tend to hold up against the real thing. From the start, we set out to build a world in the spirit of the OverTheWire wargames.

Making that world teachable without bolting a manual onto it was the harder part, and two bodies of thinking guided it. The first is how people reason and learn inside complex systems, the ideas behind distributed and situated cognition (Edwin Hutchins, James Paul Gee). Experts don't hold a whole system in their heads; they offload it onto their environment and read it back. So in AtlasOps, the world holds the state. Incidents, topology, and service relationships are the working memory, and progress comes from reading the environment, the way an on-call engineer reconstructs an outage from dashboards rather than from memory.

The second is decades of design and human-computer interaction work on legibility (Don Norman, Jakob Nielsen, Ben Shneiderman). Every surface is built to answer two quiet questions: what can be done here and what just happened, so the world stays explorable without hand-holding. Navigation runs overview first, then drill down: the topology gives the lay of the land, and the detail is a click away. And the difficulty curve borrows from how good games teach, keeping players in the zone where a challenge is hard enough to matter but fair enough to stick with, delivering new information just when it's needed, and making failure cheap.

As a bonus, a large world built to be explored leaves plenty of room to hide easter eggs!

A note on the stack, since builders tend to ask: we made the arguably unconventional choice to build the CTF in Ruby on Rails with Hotwire. Besides making the platform feel sleek, interactive, and modern, it let us move fast and gave us solid security defaults for free, so our time went into the world and the learning curve instead of a build pipeline. The choice held up: coding agents worked well with it, the tooling stayed out of the way, and the focus stayed on building the learning environment throughout.

Why this killchain

Every move along the killchain is something our team has run into in the wild over the past year, and the chain is those findings stitched into one story.

In Antigravity, we found that unvalidated tool parameters slipped straight past "secure mode". In Cursor, implicitly trusted shell built-ins became attack vectors because the allowlist checked what ran and ignored the poisoned context around it. Docker's Ask Gordon could be steered with a payload that lived inside its own trusted ecosystem, where domain allowlisting was never going to help. An n8n flaw ran attacker input as code while rendering a confirmation page, before any sanitizer got a vote. And on Gemini CLI, a single injected instruction became stolen credentials and then a push to main.

Lined up, they're variations on the same theme:

Security controls keep checking the action while staying blind to the context, identity, and trust it runs in. Allowlists, sandboxes, secure modes, and command whitelists: each validates what is being done and never asks who asked for it or where it runs.
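
To make that gap concrete, here is a minimal Python sketch. It is not taken from AtlasOps or from any of the products named above; `ALLOWED_COMMANDS`, `RISKY_ENV_VARS`, the `Request` fields, and both check functions are hypothetical names chosen for illustration. The first check validates only the action; the second also asks who requested it and what environment it would run in.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist of binaries an agent may run.
ALLOWED_COMMANDS = {"ls", "cat", "git"}

# Environment variables that can change what an "allowed" binary actually does.
# The set itself is an illustrative assumption, not a complete list.
RISKY_ENV_VARS = {"PATH", "LD_PRELOAD", "GIT_SSH_COMMAND"}


@dataclass
class Request:
    command: str        # binary the agent wants to run
    requested_by: str   # "user", or "tool_output" for untrusted content
    env_overrides: dict = field(default_factory=dict)


def action_only_check(req: Request) -> bool:
    """Validates what is being run and nothing else."""
    return req.command in ALLOWED_COMMANDS


def context_aware_check(req: Request) -> bool:
    """Also asks who asked for it and what environment it runs in."""
    if req.command not in ALLOWED_COMMANDS:
        return False
    if req.requested_by != "user":
        return False
    if RISKY_ENV_VARS & set(req.env_overrides):
        return False
    return True


# An allowed command, requested by injected content, in a poisoned environment.
poisoned = Request(
    command="git",
    requested_by="tool_output",
    env_overrides={"GIT_SSH_COMMAND": "sh -c 'id'"},
)
benign = Request(command="git", requested_by="user")

print(action_only_check(poisoned))    # the action-only check lets it through
print(context_aware_check(poisoned))  # the context-aware check refuses
print(context_aware_check(benign))    # the ordinary request still works
```

The second check is far from a complete defense; the point is only that the first one never had the information it needed to say no.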

It's the same failure mode behind the lethal trifecta we've written about before: give a system access to private data, exposure to untrusted input, and a way to reach out, and exfiltration is on the table, no matter how much the prompt is hardened.
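
As a rough illustration, the trifecta can be written down as a three-flag check. This is a sketch under our own naming: `AgentCapabilities` and `has_lethal_trifecta` are hypothetical, and a real assessment would derive these flags from an agent's actual tools and data sources rather than from hand-set booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    reads_private_data: bool       # e.g. secrets, source code, tickets
    ingests_untrusted_input: bool  # e.g. web pages, issues, tool output
    can_reach_out: bool            # e.g. HTTP requests, git push, email


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three conditions hold at once."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_input
        and caps.can_reach_out
    )


# A hypothetical ops copilot with real reach, and the same copilot with
# outbound access removed.
ops_copilot = AgentCapabilities(True, True, True)
no_egress = AgentCapabilities(True, True, False)

print(has_lethal_trifecta(ops_copilot))
print(has_lethal_trifecta(no_egress))
```

Note that prompt hardening does not appear anywhere in the check: removing any one of the three flags changes the answer, and rewording the system prompt does not.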

AtlasOps turns that lesson into something playable. Early on comes the discovery that visibility isn't authorization; later, that a sandbox is a boundary only until it isn't, that changing the context a command runs in can make an allowlist evaporate, and finally that a channel which trusts a key instead of a sender will take orders from anyone holding the key. Each step hands the player one new capability, and they stack until a chat box has turned into cluster-admin access.
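
The last of those lessons, a channel that trusts a key instead of a sender, can be sketched with Python's standard `hmac` module. This does not describe how any AtlasOps level works; `SHARED_KEY`, `SENDER_KEYS`, the sender names, and the two `accepts_*` functions are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret shared by everything that talks on the channel.
SHARED_KEY = b"demo-shared-key"

# Hypothetical per-sender secrets, so a signature is tied to one identity.
SENDER_KEYS = {"deploy-bot": b"demo-deploy-bot-key"}


def sign(key: bytes, message: bytes) -> str:
    """HMAC-SHA256 signature of the message, hex encoded."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def accepts_by_key(message: bytes, signature: str) -> bool:
    """Trusts the key: anyone who holds SHARED_KEY can issue orders."""
    return hmac.compare_digest(sign(SHARED_KEY, message), signature)


def accepts_by_sender(sender: str, message: bytes, signature: str) -> bool:
    """Trusts the sender: the signature must match that sender's own key."""
    key = SENDER_KEYS.get(sender)
    if key is None:
        return False
    return hmac.compare_digest(sign(key, message), signature)


# Someone who has picked up the shared key signs their own order.
order = b"scale payments to 0"
forged = sign(SHARED_KEY, order)

print(accepts_by_key(order, forged))                   # accepted
print(accepts_by_sender("intruder", order, forged))    # unknown sender
print(accepts_by_sender("deploy-bot", order, forged))  # wrong key for sender
```

Per-sender keys only narrow the blast radius: a stolen `deploy-bot` key would still be honored, which is why identity has to be checked alongside everything else in the chain.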

We chose this path because the blast radius is real: it follows the shape of an actual agent compromise, where each porous boundary compounds the last until a single low-privilege foothold reaches cluster admin.

Who should try it

We recommend this game to anyone curious about AI security. Red-teamers get a place to practice the moves, while AppSec and platform teams who suddenly own an AI copilot get a direct way to feel why that copilot is a new kind of attack surface. If you're a developer who builds agents but has never watched one get turned against its own environment, this is a chance to see it happen somewhere safe.

No AI security expertise is required to start; curiosity and a willingness to poke at the system are enough.

You can play AtlasOps here.