How much energy, water, money, and infrastructure are we willing to spend to sustain AI?

The conversation about AI shouldn't stop at what the technology can do.
We should also be asking:

How much energy, water, money, and infrastructure are we willing to spend to sustain it?

Today we are watching an accelerating race to build bigger models, more data centers, more GPUs, and more automation. The big companies are investing billions as if the economic return were guaranteed, but there is still not enough clarity about the real cost per user, the net margin of many AI products, or the long-term energy impact.

And here a critical point appears:

Even if the models get optimized, total consumption can keep growing.
When a technology becomes more efficient and cheaper to use, it usually gets used more. This is known as the rebound effect. The same can happen with AI: faster, cheaper models could lead to AI being embedded in everything: programming, marketing, support, advertising, CRM, ERP, education, healthcare, finance, agriculture, logistics, and autonomous agents working in the background.

The problem is not that AI exists.

The problem is AI without limits, without clear measurement, and without environmental, economic, and social accountability.

AI can be a great tool for improving productivity, science, education, healthcare, and access to knowledge. But if it is used mainly to replace people, churn out junk content, automate endless advertising, and increase consumption without measuring impact, it stops being progress and starts to look like resource extraction.
We are not ready to hand complete control of our resources over to AI systems.

We are not ready for an AI that grows faster than the electrical grid, the regulations, and the environmental capacity of the planet.
That is why I believe we need a more serious conversation about:

  • Mandatory measurement of energy, water, and emissions per data center.
  • Large models only when they are genuinely necessary.
  • Small, specialized models for specific tasks.
  • Clear reporting of the real costs and revenues of AI products.
  • Regional limits where the power grid or the water supply can't keep up.
  • External audits of environmental and economic impact.
  • Regulation before approving giant new infrastructure loads.

The question shouldn't just be:

"What can AI do?"

The more important question should be:

"What cost are we willing to pay as a society to use it without limits?"

AI can be part of the future, yes.

But it shouldn't become an excuse to consume energy, water, talent, money, and infrastructure without control.

AI, yes, but with limits.
Useful AI, not runaway AI.
Supervised AI, not AI that owns our resources.

Feature Flags That Actually Ship: Lessons From the Trenches

It was 2:47 AM when the alerts started. A seemingly straightforward database migration had triggered a cascading failure across three downstream services, and our payment processing pipeline was dropping roughly 12% of transactions. The on-call engineer didn’t need to wake anyone, locate a rollback script, or wait for a CI pipeline to churn through another deploy. She opened the LaunchDarkly dashboard, toggled one kill switch, and the system reverted to the stable path within seconds. The migration was still there, still deployed — just no longer live.

That moment crystallized something I’d been learning across two and a half decades of building software: separating deployment from release isn’t a nice-to-have. It’s the difference between a system you trust and one you fear touching on a Friday afternoon.

This article captures what I’ve learned using feature flags in production — the patterns that held up under pressure, the mistakes I’ve watched teams repeat (and made myself), and the practical steps you can take whether you’re evaluating LaunchDarkly or already deep into your feature flag journey. I’m publishing this here first because the developer community gives the most honest feedback, and I’d rather refine these ideas with you before they land on LeadDev and DZone.

The Patterns That Actually Matter

When you first start with feature flags, everything looks like a toggle. The key is that not all flags serve the same purpose, and conflating them creates the very fragility you're trying to avoid.

Release Flags

These gate unfinished features. They’re temporary by design — the flag exists while the feature stabilizes, then gets removed. The mistake I see most often is teams treating release flags as permanent configuration knobs. When a flag has been at 100% for three months, nobody remembers which code path is the “real” one, and your test matrix silently doubles.

In practice, this means setting a removal date the moment you create the flag. Our team attaches an expiration tag to every release flag and runs a weekly script that surfaces anything past its removal window. We borrowed from the FlagShark playbook here: flags older than 90 days that aren’t operational kill switches get an automatic ticket filed.
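
Our weekly sweep is internal tooling, but its shape is simple enough to sketch. Assume each flag record carries a `type` and an ISO `expires` tag; these metadata fields are our own convention, not anything the LaunchDarkly API returns:

```python
from datetime import date, datetime

# Flag metadata as our team tags it; the field names are our convention.
FLAG_METADATA = [
    {"key": "release_checkout_redesigned_ui", "type": "release", "expires": "2024-01-15"},
    {"key": "ops_payments_new_provider", "type": "ops", "expires": None},
    {"key": "experiment_recommendations_v2", "type": "experiment", "expires": "2099-12-31"},
]

def stale_flags(flags, today=None):
    """Return keys of flags past their removal window."""
    today = today or date.today()
    stale = []
    for flag in flags:
        # Operational kill switches are reviewed annually, never auto-expired.
        if flag["type"] == "ops" or flag["expires"] is None:
            continue
        if datetime.strptime(flag["expires"], "%Y-%m-%d").date() < today:
            stale.append(flag["key"])
    return stale

print(stale_flags(FLAG_METADATA, today=date(2024, 6, 1)))
# -> ['release_checkout_redesigned_ui']
```

Anything this surfaces gets an automatic ticket, per the 90-day rule above.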

Centralize your flag keys in a single file: it gives you a one-glance inventory and prevents the typo-driven debugging sessions that scattered string literals create:

// code/src/flags.js — single source of truth for all flag keys
// See companion project: code/src/flags.js

const FLAGS = {
  // Kill switch: wraps the payment provider integration.
  // Defaults to FALSE (safe path) if SDK is unreachable.
  PAYMENT_PROVIDER_KILL_SWITCH: "ops_payments_new_provider",

  // Release flag: gates the new checkout UI.
  // Temporary — remove after 100% rollout + 14 days stable.
  NEW_CHECKOUT_UI: "release_checkout_redesigned_ui",

  // Experiment flag: percentage rollout of recommendation engine.
  RECOMMENDATION_ENGINE: "experiment_recommendations_v2",

  // Permission flag: enterprise-only feature.
  ENTERPRISE_ANALYTICS: "permission_enterprise_analytics",
};

The naming convention follows a pattern: {type}_{team/domain}_{feature}_{detail}. This tells you at a glance what a flag does, who owns it, and when it should be removed. Release flags should be short-lived. Ops flags (kill switches) should be reviewed annually. Experiment flags expire when the experiment ends.

Here’s the LaunchDarkly client initialization — a singleton that streams flag rules and caches them locally so evaluations work even during network interruptions:

// code/src/launchdarkly.js — LD client singleton
// See companion project: code/src/launchdarkly.js

const LaunchDarkly = require("@launchdarkly/node-server-sdk");

async function initLaunchDarkly(sdkKey) {
  const ldClient = LaunchDarkly.init(sdkKey);

  try {
    await ldClient.waitForInitialization({ timeout: 5 });
    console.log("[LaunchDarkly] Client initialized successfully");
  } catch (err) {
    console.warn(
      "[LaunchDarkly] Initialization timed out — operating from cache or defaults"
    );
  }

  return ldClient;
}

Kill Switches

A kill switch is a different animal entirely. It’s not about shipping features — it’s about operational safety. Every integration point with an external system, every experimental code path, every performance-sensitive refactor gets wrapped in one.

The pattern that saved us at 2:47 AM looked like this:

// code/src/server.js — Kill Switch pattern
// See companion project: code/src/server.js, GET /api/payment/status

app.get("/api/payment/status", async (req, res) => {
  const context = { kind: "user", key: req.query.user || req.ip };

  // Default: false = use safe fallback path.
  // If LaunchDarkly is unreachable, the SDK returns the default.
  const useNewProvider = await client.boolVariation(
    FLAGS.PAYMENT_PROVIDER_KILL_SWITCH,
    context,
    false   // <-- THE CRITICAL DEFAULT: safe path
  );

  if (useNewProvider) {
    return res.json({ provider: "new-payment-provider", status: "ok" });
  }

  // Safe fallback: the existing, battle-tested provider.
  res.json({ provider: "existing-payment-provider", status: "ok" });
});

The critical design requirement: the fallback path must be the one that works. If your kill switch guards a new payment provider integration, the fallback routes through the existing, battle-tested provider. If the flag evaluation itself fails due to a network issue, LaunchDarkly’s SDK returns the default value you specify — which should always trigger the safe path.

Percentage Rollouts

Deterministic hashing based on a stable user attribute means the same user sees the same experience across sessions. This matters more than you’d think — users notice inconsistency, and your metrics become meaningless if a single user bounces between variants.

Our rollout cadence settled into a rhythm: internal team for one day, 1% of external users for a day, then 5%, 25%, and full release if all guardrails stay green. At each stage, we watch application error rates, API latency, and business metrics. LaunchDarkly’s Guarded Releases can automate the pause-or-rollback decision if a threshold is breached, which removes the 3 AM judgment call from the equation.

// code/src/server.js — Percentage rollout with string variation
// See companion project: code/src/server.js, GET /api/recommendations

app.get("/api/recommendations", async (req, res) => {
  const context = { kind: "user", key: req.query.user || "anonymous" };

  // stringVariation for multi-variant experiments.
  // Deterministic hashing on user key ensures the same user
  // consistently sees the same variant.
  const variant = await client.stringVariation(
    FLAGS.RECOMMENDATION_ENGINE,
    context,
    "v1"   // default: existing recommendation engine
  );

  if (variant === "v2") {
    return res.json({
      engine: "collaborative-filtering-v2",
      recommendations: ["Item-A", "Item-B", "Item-C"],
    });
  }

  res.json({
    engine: "popularity-based-v1",
    recommendations: ["Item-X", "Item-Y", "Item-Z"],
  });
});

And here’s user targeting in action — enterprise features gated by a custom attribute:

// code/src/server.js — Targeting with custom attributes
// See companion project: code/src/server.js, GET /api/analytics/dashboard

app.get("/api/analytics/dashboard", async (req, res) => {
  const context = {
    kind: "user",
    key: req.query.user || "anonymous",
    plan: req.query.plan || "free",  // custom attribute for targeting rules
  };

  const canAccess = await client.boolVariation(
    FLAGS.ENTERPRISE_ANALYTICS,
    context,
    false
  );

  if (!canAccess) {
    return res.status(403).json({
      error: "Enterprise analytics require the Enterprise plan.",
    });
  }

  res.json({
    dashboard: "advanced-analytics",
    metrics: ["revenue-per-user", "churn-prediction", "cohort-retention"],
  });
});

All the code above comes from the companion project — a fully runnable Express app in code/src/server.js. Clone it, set your SDK key, and you’ll see every pattern respond to flag toggles in real time without a server restart.

The Questions Your Team Will Ask (And How to Answer Them)

When you introduce feature flags at scale, you’ll hear the same objections. I’ve had these conversations enough times to recognize the patterns.

“Doesn’t this just create more code to maintain?”

Yes, if you treat flags as permanent. The entire discipline of flag lifecycle management exists because flags without expiration dates become technical debt with a feature flag logo. The countermeasure is mechanical, not cultural: automation that flags stale toggles, creates cleanup tasks, and blocks new flags when the ratio of creation to removal tips past 2:1.

We enforce a simple rule: every flag has an owner, an expiration date, and a ticket filed at creation time for its eventual removal. When a release flag hits 100% rollout for two weeks, the cleanup PR gets auto-generated. This isn’t optional — it’s how you prevent the flag graveyard.

“What if the flag service goes down?”

LaunchDarkly SDKs maintain a streaming connection and cache flag rules locally. If the connection drops, evaluations continue against the cached ruleset. The boolVariation call includes a default value parameter precisely for this scenario — and every code path I write defaults to the safe, existing behavior.

In the 2:47 AM scenario, the kill switch worked because the SDK had already cached the flag state. Even if LaunchDarkly’s service had been unavailable at that exact moment, the toggle would have still evaluated correctly against the local cache.

“Can’t we just build this ourselves?”

Technically, yes. I’ve seen teams build internal feature flag systems. I’ve also seen those same teams spend sprint after sprint maintaining edge-case evaluation logic, building dashboards, and debugging deterministic hashing when they could have been building their actual product. The key consideration here isn’t whether you can build it — it’s whether maintaining a feature flag platform is where your team’s time creates the most value.

Where We Go From Here

If you’re starting with feature flags, begin with one operational kill switch on a high-risk integration. Get comfortable with the pattern, build the muscle memory for flag cleanup, then expand to release flags and progressive rollouts. The most successful adoptions I’ve seen started small and grew organically, rather than attempting a company-wide flag-everything initiative overnight.

For deeper dives, the LaunchDarkly documentation on guarded rollouts and kill switch flags is excellent. The FlagShark best practices guide informed much of our internal naming and lifecycle discipline. And if you want to understand why stale flags genuinely keep me up at night, read about the $460M Knight Capital incident — a stark reminder that unreachable code paths aren’t harmless.

The original version of this article, along with a companion project demonstrating every pattern discussed here, lives on this blog. I’ll be expanding it based on your questions and feedback before it goes to LeadDev and DZone — so if something here sparks a thought or a disagreement, I’d genuinely like to hear it in the comments.

Key Takeaways

Separate deployment from release. A deployed change that isn’t live yet is a safety net. A deployed change that’s fully live with no way to turn it off is a liability.

Treat flag cleanup as a first-class engineering practice. Naming conventions, expiration dates, and automated removal aren’t overhead — they’re what keep your codebase comprehensible six months from now.

Default to safety. Every flag evaluation should fall back to the known-good path. The time to verify your kill switch works isn’t during an incident at 2:47 AM.

Start small, automate early, and build the habits before you build the flag count. The teams I’ve watched succeed with feature flags aren’t the ones with the most sophisticated tooling — they’re the ones with the most disciplined lifecycle management.

Logic Apps Agent Loop + MCP: Two Bugs Worth Knowing About

I spent the long weekend pushing Logic Apps MCP server capabilities further than I had before — and hit two bugs worth documenting. Both are filed. If you’re building in this space, save yourself the debugging time.

Context

If you’ve been following along, the MCP server and BODMAS Agent are covered in the previous posts. This post is just about what broke when I wired them together.

Bug 1 — Intermittent duplicate key error at tool registration

What happens

The Agent Loop fails with a BadRequest before making a single MCP call:

HTTP request failed: 'An item with the same key has already been added. Key: {tool_name}'.

The key referenced in the error — BasicArithmeticMCP, ExtendedArithmeticMCP, whatever you name it — appears exactly once in the workflow definition. There is no actual duplicate in the JSON.

What makes it particularly frustrating to diagnose

It is intermittent. Some runs fail, others succeed with identical configuration and identical input. No changes between a failing and a succeeding run — same workflow, same expression, same everything.

Load test

I fired 5 to 10 parallel requests at the Agent Loop as a mini stress test. It failed — the duplicate key error appeared across multiple runs in the batch.

Sequential calls with proper spacing between them worked fine.

What you can’t do

The Agent action has a default retry policy, but it does not help here. A BadRequest (400) is not treated as a transient error — the retry policy targets server-side failures (5xx), not client errors. So even with retries configured, the duplicate key error causes an immediate terminal failure. There is no clean in-workflow workaround.

Bug 2 — MCP Connector does not support OAuth

What happens

Both the MCP server and the MCP client are Logic Apps Standard. When OAuth is configured on the MCP server side, the workflow doesn’t trigger at all — it never reaches the Logic App. The connection gets corrupted at design time with the OAuth setup, and no run is created.

  • Tools don’t load, but you can still save the workflow.
  • Pushing a request returns a 502 Bad Gateway.
  • The same endpoint called directly from Postman with a valid bearer token works fine.

Why it matters

To get the Agent Loop working, the MCP server has to run with either anonymous authentication or key-based authentication. OAuth simply does not work with the built-in MCP client connector.

Current state

Both issues are filed on the Logic Apps GitHub repo:

Agent Loop: “An item with the same key has already been added” when using McpClientTool

The issue covers both bugs with full workflow JSON, reproduction steps, and screenshots. If you’ve hit either of these, add a reaction or comment — the more signal on the issue, the better.

What works in the meantime

  • Set "type": "anonymous" in the McpServerEndpoints authentication block in host.json — removes the OAuth blocker for dev and demo use
  • Accept the intermittent failure rate on the Agent Loop and re-trigger manually when it hits — not a fix, but the success rate is high enough to keep building and testing
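
For the first bullet, the fragment looks roughly like this. The outer placement inside host.json depends on your Logic Apps Standard configuration, so treat the nesting as an assumption; the `authentication` block is the part the bullet describes:

```json
"McpServerEndpoints": {
  "authentication": {
    "type": "anonymous"
  }
}
```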

Both issues are filed. If you hit either of them, the GitHub issue is the right place to add signal.

Mythos Found a 27-Year-Old Bug in OpenBSD. Your Code Is Next.

Anthropic’s new Mythos Preview surfaced a 27-year-old vulnerability in OpenBSD — one of the most heavily audited codebases in the industry — and generated 181 working Firefox exploits in a benchmark where Claude Opus 4.6 managed two. Eleven organizations are inside the launch cohort. The rest of us aren’t, and the next Mythos won’t be gated.

What Mythos is, in hard numbers

On April 7, Anthropic announced Claude Mythos Preview, a frontier general-purpose model with a step-change in computer security capability. The numbers are the story:

  • A 27-year-old vulnerability in OpenBSD, surfaced by Mythos in the TCP SACK implementation. OpenBSD’s audit posture is the high bar in the industry.
  • A 16-year-old vulnerability in FFmpeg’s H.264 codec — the media component shipped in nearly every modern browser and video pipeline.
  • A 17-year-old remote code execution vulnerability in FreeBSD’s NFS implementation (CVE-2026-4747).
  • Linux kernel vulnerabilities autonomously chained by the model into a complete privilege escalation to root.
  • 181 working Firefox exploits in a benchmark where Claude Opus 4.6 produced two — an order-of-magnitude leap in a single model generation.
  • 271 vulnerabilities patched in Firefox 150 after Mozilla used an early version of Mythos Preview to scan its codebase. Mozilla described the model as “every bit as capable” as the best human security researchers.
  • Thousands of zero-days identified in operating systems, browsers, and infrastructure software in the weeks before announcement.

Anthropic was clear about something else worth dwelling on: the company did not explicitly train Mythos for these capabilities. They emerged as a downstream consequence of general improvements in code, reasoning, and autonomy. The same improvements that make the model a better defender make it a better attacker. That equivalence is the whole story.

Mythos isn’t a security tool. It’s a frontier model that happens to be very good at a security task that turns out to require general intelligence. The distinction matters: capability of this kind doesn’t stay siloed.

The asymmetry just collapsed

For thirty years, the offensive-defensive asymmetry in software security was: attackers needed to find one bug, defenders needed to find all of them. The economics favored attackers — but only because finding bugs was hard, slow, and required deep human expertise.

Mythos didn’t flip the asymmetry. It collapsed the cost difference between the two activities. The same model that can find thousands of zero-days for a defender can find thousands of zero-days for an attacker. There is no “attacker mode” and “defender mode.” There is one capability with two uses, and the user picks.

For the launch cohort inside Project Glasswing — including Microsoft, Google, Apple, AWS, JPMorganChase, Nvidia, the Linux Foundation, and major security vendors — this is a defensive windfall. They get to find and patch their own bugs before anyone else can. For everyone else, the math is uglier. When this class of capability becomes broadly available (and it will), the same scan that takes Apple a quiet weekend will take a determined adversary the same quiet weekend.

What this changes about threat modeling

Pre-Mythos, the assumption underlying most enterprise risk frameworks was that vulnerabilities cost time to discover. Post-Mythos, that assumption no longer holds for sophisticated actors. The vulnerabilities are already there, in code that’s already deployed. The only question is who finds them first.

Project Glasswing’s narrow gate

Anthropic’s response to the dual-use problem is Project Glasswing: instead of releasing Mythos publicly, the model is gated to vetted partners doing defensive security work on critical infrastructure. The launch cohort is eleven outside organizations — AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks — with another forty-plus organizations given extended access. Anthropic has committed $100M in Mythos usage credits and additional funding to upstream open-source security ($2.5M to Alpha-Omega and OpenSSF, $1.5M to the Apache Software Foundation). On April 21, Bloomberg and TechCrunch reported that a small group of unauthorized users — reportedly a third-party Anthropic contractor who guessed the model’s online location — had accessed Mythos on the same day Anthropic announced the limited release.

The Glasswing structure is a reasonable response to a hard problem. The cohort is a serious set of defenders, the Linux Foundation’s inclusion broadens the open-source impact, and the upstream funding commitments are not trivial. But the structure has implications worth thinking through:

  • The launch cohort is well-resourced and concentrated. Megacaps, major security vendors, and one open-source foundation. Most enterprises, healthcare systems, utilities, and government agencies are not in the launch cohort.
  • The cohort is the world’s biggest target. Concentrating frontier offensive capability inside a known list of well-resourced firms makes those firms exponentially more valuable to compromise. The April 21 unauthorized-access incident is the canary, not the bird.
  • The gate is temporary. The capability emerged from general intelligence improvements. Other labs are on the same trajectory. Within twelve to twenty-four months, equivalent capability will be available somewhere — through a competitor, an open-weights model, or a leak. Anthropic’s caution buys the industry time. It does not buy the industry safety.
  • The defenders inside the gate have a head start. The defenders outside the gate don’t. By the time Mythos-class capability is broadly available, the cohort will have spent a year hardening their stacks. Everyone else will be starting cold.

None of this is criticism of Glasswing. It’s a description of where the rest of the industry sits: outside the gate, on the clock, with a year-or-so head start to spend on infrastructure that doesn’t assume bug discovery is expensive.

Why your legacy stack is the easy target

If Mythos found a bug in OpenBSD that survived twenty-seven years of obsessive auditing, what does it find in code that’s been quietly running in production since 1998 with no audit at all?

Legacy systems are uniquely exposed to this class of capability for reasons that have nothing to do with their original quality:

  • The code was written in a different threat model. COBOL batch jobs, C-based middleware, and FORTRAN scientific computing were written assuming network isolation, trusted operators, and small adversary budgets. None of those assumptions hold today.
  • The maintainers are gone. The engineers who wrote the original code retired a decade ago. The people who maintain it now read it; they don’t reason about it. A capable adversary scanning the same code reasons about it just fine.
  • The scale is enormous. A typical Fortune 100 enterprise runs millions of lines of legacy code. Manual audit is impossible at this volume; automated tools were built for the threat model where bug discovery was expensive. Mythos-class capability inverts that economics.
  • The code is statistically interesting. Old code has been running long enough that bugs which never triggered in production are still latent. The defects are there. They just haven’t been found yet.
  • The patch path is brittle. Even when a bug is found in a legacy system, the cost of patching is often catastrophic — recompiling a forty-year-old build chain, validating against a forty-year-old behavior contract, regression-testing dependencies that may no longer have maintainers. “We can’t patch this” is a common honest answer for legacy systems, and adversaries know it.

The 27-year-old OpenBSD bug is the canary. OpenBSD is among the most-audited code in the world. Your COBOL payroll system, your FORTRAN actuarial engine, your C-based supply chain ETL — they have not had that audit. They have the same age. They do not have the same hardening.

The honest framing is this: Mythos-class capability does not introduce new vulnerabilities. It surfaces vulnerabilities that have been latent in your systems for years or decades. The defects are already there. The economics of finding them just changed.

The defender’s playbook for the next 90 days

If we accept that Mythos-class capability will be broadly available within twenty-four months and that legacy systems are the most exposed surface, the defensive question is what to do this quarter that materially reduces risk. Five things worth prioritizing.

1. Get an honest inventory of your legacy attack surface

Most enterprises do not have an accurate inventory of what legacy code they actually run, what it touches, and what depends on it. The first step is unglamorous: catalog the legacy systems, their network exposure, the data they process, and the dependencies that would break if they went down. You cannot defend what you cannot see.

2. Build the SBOM you should already have

A Software Bill of Materials isn’t a compliance artifact; it’s the data structure you need to answer the question “is the new zero-day in our stack?” in minutes instead of weeks. Federal contractors will need one for compliance under recent OMB guidance. Build it now, before the next Mythos disclosure forces the question.
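
An SBOM makes that query mechanical. A minimal sketch against a CycloneDX-style component list; the two-component inventory below is invented, and a real SBOM would come from your build pipeline or a scanner:

```python
import json

# Invented CycloneDX-style SBOM; real ones are generated, not handwritten.
sbom_json = """{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "openssl", "version": "1.1.1k"},
    {"name": "log4j-core", "version": "2.14.1"}
  ]
}"""

def affected(sbom, package, bad_versions):
    """Return versions of `package` present in the SBOM that are vulnerable."""
    return [
        c["version"]
        for c in sbom.get("components", [])
        if c["name"] == package and c["version"] in bad_versions
    ]

sbom = json.loads(sbom_json)
print(affected(sbom, "log4j-core", {"2.14.0", "2.14.1", "2.15.0"}))
# -> ['2.14.1']
```

With the inventory in one queryable place, "is the new zero-day in our stack?" becomes a lookup instead of a three-week email thread.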

3. Modernize the highest-exposure legacy primitives first

Total legacy modernization is a multi-year program. Prioritized modernization isn’t. Identify the legacy components with (a) network exposure, (b) sensitive data flow, and (c) no maintainer — and modernize those first. Pull the C-based parser out of the perimeter. Replace the COBOL service that processes external data with a memory-safe equivalent. Leave the back-office batch job for next year.

4. Assume the patch tsunami is coming

If Mythos-class scanning produces ten thousand findings against your stack, your security team cannot triage ten thousand findings by hand. Invest in automated patch prioritization, exploit-prediction scoring (EPSS), and patch-deployment automation now — before you need it under pressure. The bottleneck of the next two years is not finding bugs. It’s deciding which ones to patch first and shipping the patches without breaking production.
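
As a toy illustration of what that prioritization looks like: rank findings by exploit likelihood and drop the noise floor. The findings and scores below are invented; real EPSS probabilities come from the FIRST.org EPSS feed:

```python
# Invented findings; in practice each would carry a real EPSS probability.
findings = [
    {"cve": "CVE-2026-0001", "component": "legacy-parser", "epss": 0.02},
    {"cve": "CVE-2026-0002", "component": "edge-gateway", "epss": 0.91},
    {"cve": "CVE-2026-0003", "component": "batch-job", "epss": 0.34},
]

def patch_queue(findings, floor=0.1):
    """Rank findings by exploit likelihood, dropping those below the floor."""
    urgent = [f for f in findings if f["epss"] >= floor]
    return sorted(urgent, key=lambda f: f["epss"], reverse=True)

for finding in patch_queue(findings):
    print(finding["cve"], finding["epss"])
# prints CVE-2026-0002 first (0.91), then CVE-2026-0003 (0.34);
# CVE-2026-0001 falls below the floor and is dropped
```

A real queue would also weight asset criticality and network exposure, but even this one-dimensional sort turns ten thousand findings into a defensible order of work.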

5. Threat-model with AI-assisted attackers in scope

Update your threat models to assume adversaries have Mythos-class capability. The questions change. “What’s our mean-time-to-detect?” matters more than “Is this code vulnerable?” (it almost certainly is). “What’s the blast radius if a single legacy primitive is fully compromised?” matters more than “Is this primitive likely to be compromised?” (it is more likely than it was). Defense in depth, network segmentation, and rapid containment become first-class controls, not best-practice nice-to-haves.

The shift in posture

Pre-Mythos: defenders optimize for bug-finding cost. Post-Mythos: defenders optimize for time-to-patch and blast-radius containment, because bugs will be found whether you find them first or someone else does.

A note for federal contractors

Federal contractors and agencies have an extra layer of implications: the procurement and compliance machinery that governs federal software is going to reckon with this — slowly, but inexorably. Expect SBOM and provenance requirements (already mandated under EO 14028) to get enforced in earnest. Expect NIST SSDF / SP 800-218 to shift from documentation to continuous attestation. Expect legacy waivers to become harder to defend, with risk-acceptance memos required to explicitly acknowledge Mythos-class threat. Expect patch SLAs to compress — sub-week response on high-severity findings against widely-deployed primitives is the realistic floor, not the ceiling. Vendor due-diligence will move from annual questionnaires to continuous attestation.

The realistic posture for the next twenty-four months is not “modernize everything.” It is “modernize the exposed surface, instrument the rest, and assume the rest will eventually be reached.” The agencies and primes that prepare for that reality now will not be the ones writing breach-notification letters in 2027.

The honest read

Mythos is not a doomsday model. It is a step on a curve that the entire industry has been on for several years, and Anthropic’s decision to gate it through Glasswing is, in our view, the responsible move. We don’t think the right reaction is panic, and we don’t think the right reaction is dismissal.

The right reaction is to use the Glasswing window — the twelve to twenty-four months where this capability remains gated to a small vetted cohort — to do the unglamorous defensive work that everyone has been deferring. Inventory the legacy. Build the SBOM. Modernize the exposed primitives. Automate the patch path. Threat-model with AI-assisted attackers in scope.

We don’t know exactly when the next Mythos lands or who ships it. We do know it will not be gated like this one. The defenders who used the window will be fine. The defenders who didn’t will be writing the postmortem.

Codavyn helps enterprise and federal teams modernize the exposed surface of legacy stacks before AI-assisted scanning catches up. Custom software, modernization, and a threat model that assumes the attacker is reading your code as fast as you are. See our modernization services or book a 30-minute risk review.

How to Prevent IDOR Vulnerabilities in Django REST APIs

An authenticated user changes /api/orders/42/ to /api/orders/43/ and reads someone else’s order. No privilege escalation needed — the endpoint just returns it. This is IDOR in its simplest form, and it’s endemic in Django REST Framework code because DRF makes it trivially easy to wire up a ModelViewSet that exposes every object in a table. The authentication layer does its job; the authorization layer was never written.

How IDOR Attacks Work Against Django REST APIs

IDOR (Insecure Direct Object Reference) happens when an API accepts a user-controlled identifier — a URL path segment, query param, or request body field — and retrieves the corresponding object without verifying that the requesting user has any right to it. Authentication proves who you are. Authorization proves what you can touch. Most IDOR bugs exist because the first check was implemented and the second was skipped.

A typical attack against a vulnerable DRF app:

  1. Attacker authenticates as alice@example.com and creates an order. The response contains {"id": 101, ...}.
  2. Attacker sends GET /api/orders/100/. The API returns Bob’s order because nothing checks ownership.
  3. Attacker scripts a loop from ID 1 to 10000, dumps every order in the database. Sequential integer PKs make enumeration take seconds.
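The enumeration loop itself is trivial. Here is a minimal sketch of the logic, with `fetch` standing in for an authenticated HTTP GET against `/api/orders/<id>/` (in a real attack this would be an HTTP client carrying the attacker's own valid session token); the names and the simulated data are illustrative, not from a real target:

```python
import json

def dump_orders(fetch, id_range):
    """Collect every order the vulnerable endpoint leaks.

    fetch(order_id) stands in for an authenticated
    GET /api/orders/<id>/ call: it returns the JSON body on 200
    or None on 404.
    """
    leaked = []
    for order_id in id_range:
        body = fetch(order_id)
        if body is not None:
            leaked.append(json.loads(body))
    return leaked

# Simulated target: two rows, neither owned by the attacker.
db = {100: '{"id": 100, "owner": "bob"}', 101: '{"id": 101, "owner": "alice"}'}
stolen = dump_orders(db.get, range(1, 10_000))
assert [o["id"] for o in stolen] == [100, 101]  # the whole table, in seconds
```

The loop needs nothing beyond one known-valid ID to calibrate the range; sequential PKs supply the rest.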

Here is the vulnerable ViewSet pattern we see most often in real codebases:

# views.py — VULNERABLE
from rest_framework import viewsets
from rest_framework.permissions import IsAuthenticated
from .models import Order
from .serializers import OrderSerializer

class OrderViewSet(viewsets.ModelViewSet):
    serializer_class = OrderSerializer
    permission_classes = [IsAuthenticated]  # proves identity, not ownership

    def get_queryset(self):
        # Returns every order in the database — any authenticated user
        # can retrieve, update, or delete any order by guessing its PK.
        return Order.objects.all()

IsAuthenticated blocks anonymous requests, which makes it look like the endpoint is secured. But any valid session token — including one the attacker registered themselves — bypasses it. The retrieve(), update(), and destroy() actions in ModelViewSet all call get_object(), which calls get_queryset() and then filters by the URL pk. Since get_queryset() returns everything, get_object() happily resolves any ID.
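A framework-free sketch of that resolution path (paraphrasing what `GenericAPIView.get_object` does, not DRF's actual code) makes the dependency concrete: whatever `get_queryset()` returns is the entire universe of objects `get_object()` can resolve from, which is why scoping the queryset is the fix:

```python
class NotFound(Exception):
    """Stand-in for the Http404 that DRF's get_object_or_404 raises."""

def get_object(get_queryset, pk):
    # Paraphrasing GenericAPIView.get_object:
    #   queryset = self.filter_queryset(self.get_queryset())
    #   obj = get_object_or_404(queryset, pk=pk)
    for obj in get_queryset():
        if obj["pk"] == pk:
            return obj
    raise NotFound(pk)

orders = [{"pk": 1, "owner": "alice"}, {"pk": 2, "owner": "bob"}]

# Vulnerable: the queryset is the whole table, so any pk resolves.
assert get_object(lambda: orders, 2)["owner"] == "bob"

# Fixed: scoped to alice, bob's order never enters the pipeline (404).
def scoped():
    return [o for o in orders if o["owner"] == "alice"]

try:
    get_object(scoped, 2)
    raise AssertionError("expected NotFound")
except NotFound:
    pass
```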

Fixing IDOR by Scoping Querysets to the Authenticated User

The correct fix is to scope get_queryset() to the authenticated user so that the object simply doesn’t exist from the API’s perspective if it doesn’t belong to the requester. This gives you a 404 instead of a 403, which is almost always the right behavior — a 403 confirms the resource exists and leaks information about the ID space.

Add a second layer with a custom BasePermission that implements has_object_permission. The queryset filter handles list and retrieve; the object permission handles mutating actions where DRF calls check_object_permissions explicitly.

# permissions.py
from rest_framework.permissions import BasePermission

class IsOwner(BasePermission):
    def has_object_permission(self, request, view, obj):
        # Explicit ownership check — queryset scoping is the first line,
        # but we defend in depth for any path that bypasses get_queryset.
        return obj.owner == request.user

# views.py — FIXED
from rest_framework import viewsets
from rest_framework.permissions import IsAuthenticated
from .models import Order
from .serializers import OrderSerializer
from .permissions import IsOwner

class OrderViewSet(viewsets.ModelViewSet):
    serializer_class = OrderSerializer
    permission_classes = [IsAuthenticated, IsOwner]

    def get_queryset(self):
        # Scope to the requesting user at the ORM layer — objects that don't
        # belong to this user never enter the retrieval pipeline at all.
        return Order.objects.filter(owner=self.request.user).select_related("owner")

    def perform_create(self, serializer):
        # Bind the new object to the authenticated user so the POST path
        # can't accept a user-controlled owner field.
        serializer.save(owner=self.request.user)

Filtering at the queryset layer beats checking IDs inside the view body for two reasons. First, it’s impossible to forget: every action (list, retrieve, update, partial update, destroy) goes through get_queryset(). Second, it eliminates a whole class of coverage bugs where ownership is checked in the GET handler but never re-checked in PATCH or DELETE.

The same defense-in-depth principle applies to object-level auth in gRPC services and any RPC-style API where the framework doesn’t give you a queryset abstraction: filter first, check permissions on the resolved object second.

Use Unguessable Identifiers Instead of Sequential IDs

Sequential integer PKs are an enumeration gift. Once an attacker has one valid ID, they have a roadmap to every other record. Replacing exposed identifiers with UUIDs or opaque slugs doesn’t fix the authorization hole — that requires the fixes above — but it raises the cost of bulk enumeration from “write a loop” to “brute-force a 128-bit space.”
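The gap between those two costs is worth quantifying. A back-of-the-envelope comparison, assuming a generous attacker making a billion guesses per second:

```python
import uuid

# uuid4 carries 122 random bits; 6 of the 128 are fixed (version/variant).
assert uuid.uuid4().version == 4
keyspace = 2 ** 122

# Sequential integer PKs: one request per row, so a 10k-row table
# enumerates in ~10k requests.
sequential_requests = 10_000

# Random UUIDs at a billion guesses per second:
seconds_to_scan = keyspace / 1_000_000_000
years_to_scan = seconds_to_scan / (60 * 60 * 24 * 365)
assert years_to_scan > 1e20  # enumeration is gone; the authz hole is not
```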

# models.py
import uuid
from django.db import models

class Order(models.Model):
    # Use UUIDField as the primary key to prevent sequential enumeration.
    # This is defense in depth — queryset scoping is still mandatory.
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    owner = models.ForeignKey(
        "auth.User", on_delete=models.CASCADE, related_name="orders"
    )
    total = models.DecimalField(max_digits=10, decimal_places=2)
    created_at = models.DateTimeField(auto_now_add=True)

# urls.py — router uses the UUID field as the lookup
from rest_framework.routers import DefaultRouter
from .views import OrderViewSet

router = DefaultRouter()
router.register(r"orders", OrderViewSet, basename="order")

# views.py addition. With a UUIDField primary key, DRF's default
# lookup_field = "pk" already resolves /api/orders/<uuid>/, so this
# override documents intent rather than changing behavior.
class OrderViewSet(viewsets.ModelViewSet):
    lookup_field = "id"  # matches the UUIDField name on the model
    # ... rest of ViewSet unchanged from the fix above

One tradeoff: UUIDs inflate index size and can slow joins on large tables. If that matters, use a separately-stored public_id = models.UUIDField(default=uuid.uuid4, editable=False, unique=True) alongside an integer PK, and expose only public_id in serializers and URLs. The internal integer PK never appears in any HTTP response.

Never treat opaque IDs as a substitute for proper authorization. We’ve reviewed APIs that switched to UUIDs, removed the queryset scoping because “users can’t guess them now,” and then leaked UUIDs in webhook payloads, browser history, or third-party analytics — instantly making every ID known to an attacker.

Enforce Authorization at the Serializer and Nested Resource Level

Queryset scoping protects URL-path-based access. IDOR also hides in writable foreign key fields where a user submits a payload referencing another tenant’s object. A user who owns projects 10 and 11 might try {"project": 99} on a task creation endpoint to attach their task to someone else’s project.

This is especially common in multi-tenant SaaS applications where related resources belong to different organizational boundaries.

# serializers.py
from rest_framework import serializers
from .models import Task, Project

class TaskSerializer(serializers.ModelSerializer):
    class Meta:
        model = Task
        fields = ["id", "title", "project", "due_date"]

    def validate_project(self, value):
        request = self.context.get("request")
        if request is None:
            raise serializers.ValidationError("No request context available.")

        # Reject foreign keys that don't belong to the authenticated user —
        # without this check, any user can write into any project by ID.
        if not Project.objects.filter(id=value.id, owner=request.user).exists():
            raise serializers.ValidationError(
                "Project not found."  # Deliberately vague — don't confirm existence
            )
        return value

Always pass request in serializer context. DRF does this automatically when you use get_serializer() inside a view, but if you instantiate serializers directly (in management commands, signals, or background tasks), you must pass context={"request": request} manually. When there’s no request context at all — background jobs, for example — you need a different mechanism to establish the authorization boundary, typically passing the owner explicitly.

The same class of bug appears in writable nested serializers. If a LineItem serializer accepts a nested order object with an id field, a user can point that id at any order. Validate every inbound relation. For more on how this nesting problem scales, the same concepts appear in authorization patterns in GraphQL APIs, where every resolver is effectively a relation that needs its own ownership check.

Test for IDOR with Automated Authorization Checks

The only reliable way to prevent IDOR regressions is to write tests that explicitly attempt cross-user access and assert they fail. Code reviews miss it. Manual QA misses it. Tests that authenticate as user B and try to touch user A’s resources catch it every time — if you write them.

# tests/test_order_idor.py
import pytest
from django.contrib.auth import get_user_model
from rest_framework.test import APIClient
from orders.models import Order

User = get_user_model()

@pytest.fixture
def alice(db):
    return User.objects.create_user(username="alice", password="testpass123")  # noqa: S106

@pytest.fixture
def bob(db):
    return User.objects.create_user(username="bob", password="testpass123")  # noqa: S106

@pytest.fixture
def alice_order(alice):
    return Order.objects.create(owner=alice, total="99.99")

@pytest.mark.django_db
class TestOrderIDOR:
    def _client_for(self, user):
        client = APIClient()
        client.force_authenticate(user=user)
        return client

    def test_bob_cannot_retrieve_alice_order(self, alice_order, bob):
        # 404, not 403 — we don't confirm the resource exists to unauthorized users.
        response = self._client_for(bob).get(f"/api/orders/{alice_order.id}/")
        assert response.status_code == 404

    def test_bob_cannot_update_alice_order(self, alice_order, bob):
        response = self._client_for(bob).patch(
            f"/api/orders/{alice_order.id}/", {"total": "0.01"}, format="json"
        )
        assert response.status_code == 404

    def test_bob_cannot_delete_alice_order(self, alice_order, bob):
        response = self._client_for(bob).delete(f"/api/orders/{alice_order.id}/")
        assert response.status_code == 404

    def test_bob_list_does_not_include_alice_order(self, alice_order, bob):
        # List endpoint must not leak cross-user data even if IDs are unknown.
        response = self._client_for(bob).get("/api/orders/")
        assert response.status_code == 200
        # Assumes DRF pagination is enabled; with pagination off, iterate
        # response.data directly.
        ids = [str(item["id"]) for item in response.data["results"]]
        assert str(alice_order.id) not in ids

The list-endpoint test is easy to forget and catches a different bug: get_queryset() returning everything on list() but correctly filtering on retrieve(). Write both.

Wire these into CI as required checks. A failing IDOR test should block a merge the same way a failing unit test does. This is not optional — the whole point is that a developer adding a new ModelViewSet in a Friday pull request doesn’t ship a data leak to production by Monday.

Catch IDOR in Code Review and CI

Human review of pull requests should pattern-match on a short list of high-risk constructs. Any Model.objects.get(pk=...) or Model.objects.filter(id=...) call that doesn’t chain a user-scoping filter is a candidate IDOR. Any ViewSet missing permission_classes is an unauthenticated endpoint or is inheriting from a base class that may not have adequate defaults. Any serializer field of type PrimaryKeyRelatedField with a broad queryset is a potential cross-tenant write.

Automate this with Semgrep. Here is a rule that flags the most common pattern: a DRF view calling .objects.get() without an owner filter anywhere in the same expression:

# semgrep/rules/drf-idor.yml
rules:
  - id: drf-unscoped-objects-get
    patterns:
      - pattern: $MODEL.objects.get(pk=...)
      - pattern-not: $MODEL.objects.get(pk=..., owner=...)
      - pattern-not: $MODEL.objects.get(pk=..., owner__in=...)
    message: >
      Unscoped .objects.get(pk=...) in a view — add an owner filter or replace with
      a queryset scoped in get_queryset(). Risk: IDOR.
    languages: [python]
    severity: ERROR
    metadata:
      cwe: CWE-639

Run this rule on every pull request and make it a required status check alongside your test suite, not a separate “security scan” that developers learn to ignore. That is what it means to shift IDOR checks left in the pipeline.
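As one illustration of that wiring (the workflow filename and rule path are placeholders, and the job is a fragment of a full workflow file), a GitHub Actions job might look like this; `--error` makes semgrep exit nonzero when the rule fires, which is what turns the job into a blocking check:

```yaml
# .github/workflows/security.yml -- illustrative only; adapt paths and names
jobs:
  semgrep-idor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install semgrep
      - run: semgrep --config semgrep/rules/drf-idor.yml --error .
```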

Code review checklist for IDOR-prone patterns:

  • ModelViewSet or GenericAPIView subclass with no explicit get_queryset override — check what the default queryset returns.
  • permission_classes = [] or a ViewSet that inherits permission_classes from a base class you don’t control.
  • PrimaryKeyRelatedField(queryset=Model.objects.all()) in any writable serializer — this lets any user reference any row in the table on write.
  • perform_create or perform_update that doesn’t pin the owner field, leaving it open to user-supplied values.
  • Tests that only assert status_code == 200 for the happy path, with no cross-user negative test.

SAST tools like Semgrep will catch structural patterns; they won’t catch logic bugs where the filter is present but uses the wrong field. Code review has to cover that gap. The combination — automated rules catching the obvious omissions, human review focused on logic — is more effective than either alone.

Hardening Checklist and Next Steps

The layered controls, in priority order:

Queryset scoping (required): get_queryset() filters by request.user. No exceptions for convenience. If an admin view needs to return all objects, it lives in a separate ViewSet with explicit admin permission checks.

Object-level permissions (required): IsOwner or equivalent BasePermission with has_object_permission as a second line of defense. Attach it to every mutating ViewSet.

Serializer-level FK validation (required for relational writes): Every PrimaryKeyRelatedField or nested writable serializer validates that the referenced object belongs to request.user.

perform_create owner binding (required): Never accept owner from request data. Always call serializer.save(owner=self.request.user).

Opaque identifiers (defense in depth): UUIDs or opaque public IDs in all URLs and serializer output. Still mandatory to have the above controls in place.

Automated cross-user tests (required for CI gates): One test class per resource that authenticates as User B and asserts 404 on User A’s list, retrieve, update, and delete endpoints.

SAST rules in CI (defense in depth): Semgrep rules flagging unscoped .objects.get() and missing permission_classes, run as required checks on pull requests.

These controls address the majority of IDOR patterns in DRF, but authorization bugs extend well beyond the patterns covered here. If you want to build systematic habits around authorization review — across frameworks, auth protocols, and API types — the Application Security Engineer learning path on Code Review Lab covers the full scope, including scenarios more complex than single-tenant ownership checks.

The part most teams skip is the test suite. You can write perfect queryset scoping today and watch a future contributor add a get_object_or_404(Order, pk=pk) shortcut that bypasses it entirely. Tests that authenticate as the wrong user and assert 404 are the only automated check that catches that regression. Write them now, gate CI on them, and review them alongside any new ViewSet. If you want a reference for how IDOR shows up in security interviews and assessments, common IDOR interview questions are a useful signal for the gaps engineers typically leave in production systems.

Further Reading

  • OWASP IDOR Prevention Cheat Sheet — authoritative guidance on access control patterns across frameworks.
  • CWE-639: Authorization Bypass Through User-Controlled Key — the formal taxonomy entry with real-world consequences and detection guidance.
  • Django REST Framework: Permissions — official DRF docs on has_permission and has_object_permission, including check_object_permissions call semantics.
  • Application Security Engineer learning path on Code Review Lab — structured curriculum for building authorization review skills across multiple API paradigms.
  • PortSwigger Web Security Academy: IDOR — interactive labs that demonstrate enumeration, parameter tampering, and horizontal privilege escalation in concrete exercises.