[SC] Running Tasks in SwiftUI

How does the search example relate to .task?

We return to the search(_ query: String) example, where a set of entries is filtered after a 500 ms debounce. In this version, a reference to a Task has to be stored in the ArticleSearcher class (which is Observable) so it can be cancelled later. The view then invokes search(_:) from an onChange(of:) modifier (sketched after the class below).

Initial state of search(_ query: String)

@MainActor
@Observable
final class ArticleSearcher {

  private static let articleTitlesDatabase = [
    "Article one",
    "Article two",
    "Article three",
  ]

  var searchResults: [String] = ArticleSearcher.articleTitlesDatabase

  private var currentSearchTask: Task<Void, Never>?

  func search(_ query: String) {
    // Cancel any in-flight search so only the latest query survives.
    currentSearchTask?.cancel()
    currentSearchTask = Task {
      do {
        // Debounce: sleep throws CancellationError if this task is cancelled.
        try await Task.sleep(for: .milliseconds(500))
        print("Starting to search!")
        searchResults = Self.articleTitlesDatabase.filter {
          $0.lowercased().contains(query.lowercased())
        }
      } catch {
        print("Search was cancelled!")
      }
    }
  }
}
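
For context, here is a minimal sketch of the view side at this stage, assuming a simple searchable list. ArticleSearchView and query are illustrative names that do not appear in the original example.

import SwiftUI

struct ArticleSearchView: View {
  @State private var searcher = ArticleSearcher()
  @State private var query = ""

  var body: some View {
    List(searcher.searchResults, id: \.self) { title in
      Text(title)
    }
    .searchable(text: $query)
    // Initial approach: trigger the self-debouncing search on every change.
    .onChange(of: query) { _, newQuery in
      searcher.search(newQuery)
    }
  }
}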

How does the .task modifier work?

SwiftUI lets you hook a Task into a View's lifecycle through the .task() modifier. The task is scheduled to run before the view appears and is cancelled when the view disappears.
It is no longer necessary to wrap the work in Task { }; it is enough to make the method signature async.

Final state of search(_ query: String)

func search(_ query: String) async {
  do {
    try await Task.sleep(for: .milliseconds(500))
    print("Starting to search!")
    searchResults = Self.articleTitlesDatabase
      .filter { $0.lowercased().contains(query.lowercased()) }
  } catch {
      print("Search was cancelled!")
  }
}

What is the id: parameter in .task for?

.task() fires a task before a view appears. However, when the value of its id: parameter changes, the previous task is cancelled automatically and a new one is fired.

What is the priority: parameter in .task for?

Because Swift Concurrency runs on a cooperative thread pool, we can change a task's priority to get different behavior relative to other tasks. Keep in mind, though, that SwiftUI uses userInitiated by default, so in practice it mainly makes sense to lower the priority to medium (utility) or low (background).
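
With search(_:) now async, the onChange(of:) call from the earlier sketch collapses into a single modifier. A minimal sketch covering both parameters; the priority: argument is shown purely for illustration, since .userInitiated is already the default:

    .searchable(text: $query)
    // id: query restarts the task on every change; the previous task is
    // cancelled automatically, so Task.sleep throws and the stale search
    // is abandoned.
    .task(id: query, priority: .userInitiated) {
      await searcher.search(query)
    }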

Bibliography

  • Van der Lee, A. (2025). Swift Concurrency Course [Online course]. avanderlee.com.

The “Vibe Coding” Reality Check: Are You Actually Engineering or Just Prompting? 🚀

The term “Vibe Coding” gets thrown around a lot by developers. It describes using powerful AI tools to “describe the vibes” of an app and generate large amounts of code almost instantly. This is great for getting the app to market, but it leaves open the possibility that code that looks right is not architecturally sound.

The Foundations vs. The Shortcuts 🧠

In professional development, the way we deal with logic evolves in phases:

The Syntax Phase:
This is where we learn the syntax of the language: how to write a for loop in Java, how to connect to Oracle SQL, and so on. At this stage, the role of AI is that of a super-smart autocomplete.

The Architecture Phase:
This is where the real engineering takes place. It’s where we reason about why we prefer one approach over another. If we have an app that’s super “slick” and “vibe-coded” but its schema contains a transitive dependency that violates Database Normalisation, then the app remains shaky even though it looks nice on the outside.

Why Vibes Don’t Cut It in Production 🏗️

The problem with relying solely on auto-generated code is that it hides the reasoning behind what was generated. Executable code is the baseline for production; the real challenge lies in making it maintainable.

The vibe trap is real. AI-generated code tends to work only on the “happy path”: the scenario where everything goes right.

Vibes-only coding might get you a login screen that works.

An engineer might build a login screen that’s not only functional but also includes BCrypt for password hashing, try-catch for database timeouts, and an MVC structure with validation.

The Plan: Augment, Don’t Replace 🛠️

To use AI in a way that sharpens our engineering edge, we should establish a consistent, disciplined process:

Pseudocode First:
Before we ask the AI, let’s write the logic out in plain English or as a basic flowchart. If we can’t describe the logic, we’re unlikely to engineer it well.

The Line-by-Line Rule:
Don’t accept code that you can’t describe in excruciating detail. If the generated lambda is confusing, take a moment to understand how it works at a basic level.

Architectural Refactoring:
AI tools seem to favor “monolithic” code blocks. Use the principle of Separation of Concerns to divide the code into separate, more digestible parts.

Final Thought: Keep the Architect in the Loop 🏢

The idea is to work smarter, not harder. Let AI handle the boilerplate and repetitive tasks, but retain ownership of System Design and Architectural Strategy. That way, your portfolio, and your career, rest on actual engineering prowess rather than on your ability to quickly prompt an AI.

Agent Orchestration Patterns: Swarm vs Mesh vs Hierarchical vs Pipeline

When you move from a single AI agent to multiple agents working together, the first engineering question is: how do they coordinate? The coordination model — the orchestration pattern — determines your system’s latency, fault tolerance, scalability ceiling, and debugging complexity. Pick the wrong pattern and you will spend months fighting coordination overhead instead of shipping features.

This guide breaks down the five core agent orchestration patterns used in production multi-agent systems. For each pattern, we cover the architecture, where it excels, where it breaks, and real-world implementations. If you are new to multi-agent systems, start with our complete guide to AI agent architectures for the foundational taxonomy.

The Five Core Orchestration Patterns

Every multi-agent system in production today maps to one of five orchestration patterns, or a hybrid of two or more. These patterns are not theoretical — they emerge from the same distributed systems constraints that shaped microservice architectures a decade ago: coordination cost, failure isolation, throughput requirements, and observability.

The five patterns are: Orchestrator-Worker (centralized control with fan-out), Swarm (decentralized emergent coordination), Mesh (peer-to-peer direct communication), Hierarchical (tree-structured delegation), and Pipeline (sequential stage processing). Each pattern makes fundamentally different trade-offs between control, flexibility, and operational complexity.

Understanding these patterns is essential if you are building multi-agent orchestration at scale. Microsoft’s AI agent design patterns taxonomy identifies these same categories as foundational building blocks. Pattern selection is consistently the highest-leverage architectural decision in multi-agent systems — it constrains every subsequent implementation choice.

Orchestrator-Worker Pattern

The orchestrator-worker pattern is the most widely deployed pattern in production AI systems. A single orchestrator agent receives a task, decomposes it into subtasks, assigns each subtask to a specialized worker agent, and aggregates the results. Workers do not communicate with each other — all coordination flows through the orchestrator. This is the hub-and-spoke model applied to AI.

The orchestrator maintains global state, handles error recovery, and decides when the overall task is complete. Workers are stateless (or maintain only local state) and focus on a single capability: one worker handles database queries, another writes code, another calls external APIs. LangGraph’s supervisor pattern and AutoGen’s group chat with a selector agent both implement this architecture.

Orchestrator-worker is the default starting pattern for good reason. It is the easiest to debug because there is a single control flow to trace. It scales horizontally by adding workers. And it maps naturally to customer support use cases where a routing agent triages incoming tickets by intent — billing, technical, account management — and dispatches them to specialized resolution agents. Each worker resolves its ticket independently and reports the result back to the orchestrator. This is the architecture behind platforms that run hundreds of support agents with 90%+ autonomous resolution rates.
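
The fan-out/aggregate shape is easy to see in code. Here is a minimal sketch using Swift structured concurrency; the ticket model, stub workers, and keyword triage are illustrative assumptions standing in for real LLM calls, not any framework's API.

enum Intent { case billing, technical, account }

struct Ticket { let id: Int; let text: String }

// Stub workers: each owns exactly one capability and reports back.
func resolveBilling(_ t: Ticket) async -> String { "billing resolved #\(t.id)" }
func resolveTechnical(_ t: Ticket) async -> String { "technical resolved #\(t.id)" }
func resolveAccount(_ t: Ticket) async -> String { "account resolved #\(t.id)" }

// Stub triage: a real orchestrator would classify intent with an LLM call.
func classify(_ t: Ticket) -> Intent {
  if t.text.contains("invoice") { return .billing }
  if t.text.contains("password") { return .account }
  return .technical
}

// The orchestrator decomposes, fans out, and aggregates. Workers never
// talk to each other; every result flows back through this function.
func orchestrate(_ tickets: [Ticket]) async -> [String] {
  await withTaskGroup(of: String.self, returning: [String].self) { group in
    for ticket in tickets {
      group.addTask {
        switch classify(ticket) {
        case .billing:   return await resolveBilling(ticket)
        case .technical: return await resolveTechnical(ticket)
        case .account:   return await resolveAccount(ticket)
        }
      }
    }
    return await group.reduce(into: []) { $0.append($1) }
  }
}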

When Orchestrator-Worker Works

  • Customer support triage and resolution (route, resolve, verify)
  • Document processing where a coordinator splits pages across extraction workers
  • Code generation workflows where a planner distributes tasks to file-specific agents
  • Any workload where subtasks are independent and do not require inter-worker communication

When Orchestrator-Worker Breaks

The orchestrator is a single point of failure and a throughput bottleneck. If each of the orchestrator’s LLM calls takes 3 seconds and assigns work to all 20 waiting workers at once, your decomposition throughput ceiling is still only about 20/3 ≈ 6.7 task assignments per second. The orchestrator also becomes a context window bottleneck: it must hold the full task description, all worker results, and enough context to synthesize a final answer. For tasks that produce 50+ intermediate results, this exceeds current context window limits even on 128k-token models.

Swarm Pattern

The swarm pattern eliminates centralized control entirely. Agents operate as autonomous peers that make local decisions based on shared state, environmental signals, or pheromone-like markers. There is no orchestrator. Coordination emerges from simple local rules applied by many agents simultaneously — the same principle behind ant colonies, bird flocking, and blockchain consensus. No single agent needs to understand the full system.

In AI systems, swarm agents typically share a blackboard (a shared memory or state store) and use handoff protocols to transfer tasks. OpenAI’s Swarm framework popularized this approach: each agent has a set of functions and can hand off to another agent when it encounters a task outside its specialization. The key insight is that each agent only needs to know when to hand off and to whom — not the full task decomposition plan.

Swarm patterns excel at exploration tasks where the problem space is large and the optimal path is unknown. Research workflows, competitive intelligence gathering, and large-scale web scraping all benefit from swarm coordination because agents explore different branches of the search space independently and share discoveries through the blackboard. A swarm of 50 research agents can explore 50 hypotheses in parallel without any central coordinator planning the search.
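
In code, the blackboard can be modeled as a shared state store and each agent as an independent task. A minimal Swift sketch, with illustrative names and a stubbed explore step where a real agent would run its LLM-plus-tools loop:

actor Blackboard {
  private(set) var findings: [String] = []
  func post(_ finding: String) { findings.append(finding) }
}

func explore(hypothesis: String, agent: Int, board: Blackboard) async {
  // Local rule: investigate your own branch, share whatever you find.
  await board.post("agent-\(agent) explored: \(hypothesis)")
}

func runSwarm(hypotheses: [String]) async -> [String] {
  let board = Blackboard()
  // No coordinator: every agent runs independently, and coordination
  // happens only through the shared board. Termination here is simply
  // "all agents finished"; production swarms need explicit conditions.
  await withTaskGroup(of: Void.self) { group in
    for (i, h) in hypotheses.enumerated() {
      group.addTask { await explore(hypothesis: h, agent: i, board: board) }
    }
  }
  return await board.findings
}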

Swarm Trade-offs

The primary risk is observability. With no central coordinator, tracing a task from start to finish requires reconstructing the handoff chain from distributed logs. Debugging a swarm is like debugging an eventually-consistent distributed database — you need specialized tooling (distributed tracing, event sourcing, blackboard snapshots). Swarms also struggle with tasks that require strict ordering or transactional guarantees because there is no global arbiter to enforce sequence.

Another challenge is convergence: how does the system know when it is done? Without an orchestrator deciding when to stop, swarm agents need explicit termination conditions — maximum iterations, quality thresholds, or timeout-based convergence. Design these conditions carefully; overly aggressive termination produces incomplete results, while overly conservative termination burns tokens and compute. For a deeper comparison of frameworks that implement swarm patterns, see our analysis of the best multi-agent frameworks in 2025.

Mesh Pattern

Mesh is often confused with swarm, but they solve different problems. In a mesh, agents maintain persistent, explicit connections to specific peers and communicate directly. Think of the difference between a crowd passing messages through a shared bulletin board (swarm) and a team on a group call where everyone can address anyone directly (mesh). In a mesh, Agent A knows it needs Agent B for database queries and Agent C for authentication logic. The communication graph is explicit and typically defined at deploy time.

Mesh patterns shine in systems where agents need to negotiate, share intermediate state, or iterate on a shared artifact. The canonical example is a multi-agent coding system where a planning agent, coding agent, and testing agent form a tight feedback loop: the planner generates a specification, the coder implements it, the tester validates it, and failures route back to the coder with specific error messages and stack traces. This three-agent mesh iterates until all tests pass — typically 2–5 iterations for moderately complex features.
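
A minimal Swift sketch of that loop, with the three roles stubbed out; the function names, failure type, and convergence rule are illustrative assumptions:

struct TestFailure { let message: String }

func plan(_ feature: String) async -> String { "spec for \(feature)" }

func code(_ spec: String, feedback: TestFailure?) async -> String {
  guard let feedback else { return "impl of \(spec)" }
  return "impl of \(spec), fixing: \(feedback.message)"
}

func test(_ impl: String) async -> TestFailure? {
  // Stub: pass once the coder has incorporated feedback.
  impl.contains("fixing") ? nil : TestFailure(message: "edge case failed")
}

// Failures route directly back to the coder, peer to peer. There is no
// orchestrator; the loop runs until tests pass or the budget runs out.
func buildFeature(_ feature: String, maxIterations: Int = 5) async -> String? {
  let spec = await plan(feature)
  var feedback: TestFailure? = nil
  for _ in 0..<maxIterations {
    let impl = await code(spec, feedback: feedback)
    guard let failure = await test(impl) else { return impl }
    feedback = failure
  }
  return nil  // did not converge within the iteration budget
}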

Confluent’s research on event-driven multi-agent systems demonstrates how mesh patterns can be built on event streaming platforms like Kafka. Each agent publishes events to topics and subscribes to topics from peer agents. This decouples agents at the transport layer while maintaining the logical mesh topology. The result is a system where individual agents can scale independently, restart without losing state, and be replaced without reconfiguring peer connections.

Mesh Complexity Considerations

The primary risk with mesh is combinatorial explosion. A full mesh of N agents has N(N-1)/2 potential connections. At 5 agents, that is 10 connections. At 10 agents, it is 45. At 50 agents, it is 1,225. Each connection represents a potential failure point and a communication channel that needs monitoring. In practice, meshes work best with 3–8 tightly coupled agents. Beyond that, decompose into smaller meshes coordinated by a higher-level pattern — which brings us to hierarchical orchestration.

Hierarchical Pattern

The hierarchical pattern organizes agents in a tree structure with multiple levels of delegation. A top-level manager agent delegates to mid-level supervisor agents, which in turn delegate to leaf-level worker agents. Each level adds a layer of abstraction: the top level reasons about strategy, mid-levels reason about tactics, and leaf-level agents execute specific actions.

This mirrors how large engineering organizations operate. A VP sets the product direction, engineering managers translate that into sprint plans, and individual engineers write the code. The hierarchical pattern applies the same division of labor to AI agents. CrewAI’s hierarchical process is a direct implementation: a manager agent breaks down goals into sub-goals, assigns sub-goals to team leads, and team leads coordinate individual agent tasks.

The critical advantage of hierarchical orchestration is context window management. No single agent needs to hold the full context of the entire system. The top-level agent holds the high-level goal and summary results from each branch. Mid-level agents hold their team’s context. Workers hold only their specific subtask input and tools. This allows hierarchical systems to tackle problems that would overflow any single agent’s context window — like auditing an entire codebase or processing thousands of documents simultaneously.
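
A minimal Swift sketch of two levels of delegation; the names are illustrative, and the summarization comment marks the spot where a real system would spend an LLM call and risk dropping detail:

func worker(_ subtask: String) async -> String {
  // Leaf level: narrow context, full detail.
  "detailed result for \(subtask)"
}

func supervisor(goal: String, subtasks: [String]) async -> String {
  var results: [String] = []
  for subtask in subtasks {          // mid level: fan out to its own team
    results.append(await worker(subtask))
  }
  // Summarization step: detail is compressed before it travels upward.
  return "\(goal): \(results.count)/\(subtasks.count) subtasks done"
}

func manager(goal: String, branches: [(goal: String, subtasks: [String])]) async -> String {
  var summaries: [String] = []
  for branch in branches {           // top level: holds only branch summaries
    summaries.append(await supervisor(goal: branch.goal, subtasks: branch.subtasks))
  }
  return "\(goal) complete. " + summaries.joined(separator: "; ")
}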

Hierarchical Drawbacks

Latency compounds at every level. A three-level hierarchy with 2-second LLM calls at each level adds a minimum 6 seconds of coordination overhead before any worker starts executing. At four levels, it is 8 seconds. Information loss is another critical concern: each summarization step between levels risks dropping details that turn out to be essential. A worker might produce a nuanced finding that gets compressed to a single sentence by the mid-level supervisor, losing the context that the top-level manager needed to make the right decision.

For workloads where the task can be decomposed into a fixed taxonomy of subtypes, consider whether a mixture-of-experts (MoE) model might replace the first two levels of your hierarchy with a single routing layer, reducing latency while preserving specialization.

Pipeline Pattern

The pipeline pattern processes data through a fixed sequence of agent stages. Each stage receives input from the previous stage, transforms or enriches it, and passes output to the next stage. This is the assembly line of agent orchestration. The order of operations is predetermined and does not change at runtime.

Classic pipeline implementations include content generation (research, outline, draft, edit, publish), data enrichment (extract, validate, normalize, store), compliance checking (ingest document, extract claims, verify each claim, generate report), and SEO workflows (keyword research, SERP analysis, brief generation, content writing). Each stage is handled by a specialized agent optimized for that specific transformation. The stage boundaries create natural checkpoints for human review in semi-automated systems.

Pipelines are the easiest pattern to monitor and optimize. Each stage has clear input/output contracts, measurable latency, and isolated failure modes. You can profile stages independently, swap out the LLM model at any stage without affecting others, use a cheaper model for simple extraction stages and a more capable model for reasoning stages, and add stages without restructuring the system. Production pipelines often include quality gates between stages — lightweight validation agents that check whether output meets the threshold for the next stage or needs rework by the current stage.
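
A minimal Swift sketch of a four-stage content pipeline with a gate between stages; the stage functions and the emptiness check are illustrative assumptions, where real gates would be lightweight validation agents:

enum PipelineError: Error {
  case gateFailed(stage: String)
}

func research(_ topic: String) async -> String { "notes on \(topic)" }
func outline(_ notes: String) async -> String { "outline from \(notes)" }
func draft(_ outline: String) async -> String { "draft from \(outline)" }
func edit(_ draft: String) async -> String { "edited \(draft)" }

// Quality gate: cheap validation before paying for the next stage.
func gate(_ output: String, stage: String) throws -> String {
  guard !output.isEmpty else { throw PipelineError.gateFailed(stage: stage) }
  return output
}

// Fixed order, one clear I/O contract per stage: each stage can be
// profiled in isolation or moved to a cheaper model without touching
// the others.
func runPipeline(topic: String) async throws -> String {
  let notes = try gate(await research(topic), stage: "research")
  let plan  = try gate(await outline(notes), stage: "outline")
  let text  = try gate(await draft(plan), stage: "draft")
  return try gate(await edit(text), stage: "edit")
}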

Pipeline Limitations

Pipelines cannot handle tasks where the execution order depends on intermediate results. If stage 3’s output determines whether you should run stage 4A or stage 4B, you need conditional branching — at that point, you are evolving toward an orchestrator-worker or hierarchical pattern with decision nodes. Pipelines also have the longest cold-start latency for interactive use cases because every request must traverse all stages sequentially. A 5-stage pipeline with 2-second stages adds a minimum 10-second end-to-end latency, which is unacceptable for real-time chat but perfectly fine for batch processing.

Comparison Matrix

The following matrix summarizes the key trade-offs across all five patterns. Each pattern is evaluated on six dimensions that matter most in production deployments.

Orchestrator-Worker — Control: high. Scalability: medium (bottlenecked by orchestrator throughput). Fault tolerance: low (orchestrator is single point of failure). Debugging: easy (single control flow to trace). Best for: customer support, task decomposition, fan-out workloads. Typical latency: 2–5 seconds per task.

Swarm — Control: low. Scalability: high (no coordination bottleneck). Fault tolerance: high (no single point of failure, agents are replaceable). Debugging: hard (requires distributed tracing and blackboard replay). Best for: exploration, research, parallel data gathering. Typical latency: variable, depends on convergence conditions.

Mesh — Control: medium. Scalability: low (N-squared connection growth). Fault tolerance: medium (graceful degradation when peers disconnect). Debugging: medium (known topology, traceable connections). Best for: collaborative reasoning, iterative refinement, code review loops. Typical latency: 5–15 seconds per iteration cycle.

Hierarchical — Control: high. Scalability: high (tree structure scales logarithmically). Fault tolerance: medium (branch failures are isolated). Debugging: medium (level-by-level trace, summarization loss). Best for: complex multi-domain enterprise tasks, 20+ agent deployments. Typical latency: 6–12 seconds minimum (stacks per level).

Pipeline — Control: high. Scalability: medium (limited by slowest stage). Fault tolerance: low (single stage failure blocks entire pipeline). Debugging: easy (stage-by-stage inspection with clear I/O contracts). Best for: content generation, data processing, ETL, batch workflows. Typical latency: predictable, cumulative across stages.

How to Choose the Right Pattern

Pattern selection depends on four factors: task structure (are subtasks independent or interdependent?), latency requirements (interactive real-time vs. batch processing), scale (how many agents and concurrent tasks?), and observability needs (how important is end-to-end traceability for compliance or debugging?).

Decision Framework

Start with these five questions to narrow your options; the sketch after the list encodes them as a single selection function.

  1. Are subtasks independent with no inter-agent communication needed? Start with Orchestrator-Worker.
  2. Do tasks follow a fixed, predictable sequence with clear stage boundaries? Use Pipeline.
  3. Do 3–8 agents need to iterate on a shared artifact until quality converges? Use Mesh.
  4. Is the problem space large and the optimal solution path unknown? Use Swarm.
  5. Do you need 20+ agents operating across multiple domains? Use Hierarchical.
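
As a sketch, the five questions collapse into a single Swift selection function; the enum and the order of checks mirror the list above, and the fallback is the article's default starting pattern:

enum OrchestrationPattern {
  case orchestratorWorker, pipeline, mesh, swarm, hierarchical
}

func choosePattern(
  independentSubtasks: Bool,   // Q1
  fixedSequence: Bool,         // Q2
  smallIterativeTeam: Bool,    // Q3: 3–8 agents refining a shared artifact
  unknownSolutionPath: Bool,   // Q4
  agentCount: Int              // Q5
) -> OrchestrationPattern {
  if independentSubtasks { return .orchestratorWorker }
  if fixedSequence       { return .pipeline }
  if smallIterativeTeam  { return .mesh }
  if unknownSolutionPath { return .swarm }
  if agentCount >= 20    { return .hierarchical }
  return .orchestratorWorker   // the default starting pattern
}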

For customer support automation, orchestrator-worker is the proven default. The orchestrator acts as a triage and routing layer that classifies incoming tickets by intent (billing, technical, account management) and dispatches to specialized resolution agents. Each worker handles its domain independently with domain-specific tools and knowledge bases. The orchestrator tracks SLAs, escalates to humans when confidence drops below threshold, and logs the full resolution chain for quality review.

For research and analysis workflows, start with a pipeline and add swarm elements where you need exploration. A research system might use a pipeline for the core flow (define question, gather sources, extract findings, synthesize report) but deploy a swarm of 20 gathering agents in the second stage to search diverse sources in parallel. The pipeline guarantees the overall process completes in order; the swarm maximizes coverage during the gathering phase.

For enterprise-scale deployments with 50+ agents across multiple business domains, hierarchical is typically the only viable option. IBM’s research on AI agent orchestration confirms that hierarchical decomposition is the standard approach for large-scale enterprise agent systems. Domain-specific agent clusters — customer support, sales operations, IT automation — are each managed by supervisors, and supervisors report to a top-level strategic coordinator.

In practice, most production systems use hybrid patterns. A hierarchical system where the leaf-level teams use mesh coordination internally. A pipeline where one stage spawns a swarm for parallel data collection. The patterns are composable, and the best architectures combine them based on each subsystem’s requirements. For implementation guidance, see our framework comparison for 2025, which maps each framework to the patterns it natively supports.

FAQ

What is the difference between swarm and mesh orchestration?

Swarm agents coordinate through shared state (a blackboard or environment signals) without direct peer-to-peer connections. Coordination is emergent — agents follow local rules and global behavior arises from many agents acting independently. Mesh agents maintain explicit, persistent connections to specific peers and communicate directly through defined channels. Swarm topology emerges at runtime; mesh topology is defined at design time. Use swarm when the solution path is unknown and you need broad exploration. Use mesh when a known, small group of agents (3–8) needs to iterate on a shared artifact.

Can I combine multiple orchestration patterns in one system?

Yes, and most production systems do. The patterns are composable at the subsystem level. A common hybrid uses hierarchical orchestration at the top level with orchestrator-worker teams at the leaf level. Another hybrid uses a pipeline for the main workflow with a swarm at one stage for parallel data collection. The key is to choose the pattern that fits each subsystem’s specific requirements — task structure, latency tolerance, agent count — rather than forcing one pattern across the entire architecture.

Which orchestration pattern is best for customer support?

Orchestrator-worker is the proven default for customer support automation. The orchestrator acts as a triage and routing layer that classifies incoming tickets by intent (billing, technical, account management) and dispatches to specialized resolution agents. Each worker handles one domain with domain-specific tools and knowledge. This pattern provides clear audit trails for every resolution, simple escalation paths when confidence is low, and straightforward horizontal scaling by adding workers for new support categories. It is the architecture used by platforms handling thousands of tickets daily with 90%+ autonomous resolution rates.

Originally published on GuruSup Blog. GuruSup runs 800+ AI agents in production for customer support automation. See it in action.

Git Archaeology #8 — Engineering Relativity: Why the Same Engineer Gets Different Scores

The same object is lighter on the Moon and heavier on Jupiter. The same thing happens in codebases.

Previously

In Chapter 7, I talked about the universe-like structure of codebases — gravity, four forces, and “seasoned, good gravity.”

This chapter is about another fundamental property of that gravity.

Gravity Changes with the Universe

Looking at EIS results across different codebases, I noticed something.

Gravity changes depending on the universe.

EIS measures “how much gravity you created” in a codebase. But gravity has one critical property:

It depends on the space it exists in.

In physics, Earth, the Moon, and Jupiter each have different gravitational fields. The same object becomes lighter or heavier depending on where it is.

The same phenomenon occurs in codebases.

The same engineer gets different EIS scores in different codebases.

Mature Universes and Young Universes

In a mature codebase:

  • Structure is stable
  • Architects already exist
  • Abstractions are well-established
  • “Seasoned, good gravity” is already present

In such environments, creating new gravity is not easy. The stronger the existing structure, the more energy it takes to shift the center. EIS scores are harder to raise.

In a structurally weak codebase:

  • No central structure exists
  • Design is fragmented
  • Abstractions are lacking

In such environments, new gravity forms easily. The first person to introduce decent design becomes an Architect overnight. EIS scores are easier to raise.

EIS Is Not an Absolute Value

This means EIS is not an absolute value.

EIS is determined not by an engineer’s ability alone, but by the interaction between the engineer and the codebase’s gravitational field.

This is, in a sense —

Engineering Relativity.

The same engineer, in a different universe, produces different gravity.

The Trap of Raw Numbers

This has important implications for engineering evaluation.

Imagine an engineer whose scores look like this:

Repo A (Backend API)           Total: 35
Repo B (New microservice)      Total: 60

Naturally, 60 looks “better.”

But if Repo A has an extremely strong gravitational field — multiple Architects, highly refined structure, battle-tested abstractions — then 35 in that context may actually be remarkable.

There’s a “normalization trap” here. EIS’s relative normalization means the top contributor in each team scores 100 — so the top score in one repo might be mediocre in another. But this chapter’s point is more fundamental than normalization mechanics. Normalization is a calculation issue; Engineering Relativity is a structural issue.

The codebase itself changes the meaning of the score.

That’s Engineering Relativity.

Reading EIS with Relativity in Mind

How do you account for this relativity when reading EIS? Here are some approaches.

1. Check Team Classification

Look at eis analyze --team:

Structure: Architectural Engine  →  Strong gravitational field (scores are hard-earned)
Structure: Unstructured          →  Weak gravitational field (scores come easily)

Total: 40 inside an Architectural Engine and Total: 40 inside an Unstructured team have completely different meanings.

2. Look at Architect Density

The more Architects on a team, the harder it is to raise your Design axis. This is a natural consequence of relative normalization. Scoring Design: 60 in a team with three Architects is likely harder than scoring Design: 100 in a team with none.

3. Use --recursive for Cross-Repo Analysis

eis analyze --recursive ~/workspace

Analyzing across multiple repositories reveals an engineer’s “gravity beyond a single universe.” Producer in one repo, Architect in another — that pattern reveals adaptability and latent capability.

4. Watch “Gravitational Field Changes” in Timelines

eis timeline --span 6m --periods 0 --recursive ~/workspace

Codebase structure isn’t static. Member departures, refactoring, new features — these shift the gravitational field. In timelines, you can distinguish “engineers whose scores rise when structure weakens” from “engineers who maintain stable scores regardless of structural strength.”

The Reproducibility of Architects

Here’s where it gets interesting.

Truly great engineers create gravity in any universe.

Different codebase. Different team. Different tech stack. They still build structural centers.

This might be called Architect Reproducibility.

When you analyze an entire workspace with --recursive, an engineer who is consistently Architect across multiple repositories has “general-purpose design capability” that doesn’t depend on any specific codebase.

Conversely, an engineer who is Architect in only one repository is creating gravity within that repository’s specific context. This is also valuable, but it’s a different kind of strength.

EIS cross-repository analysis makes this reproducibility numerically verifiable:

Author     Backend API    Frontend    Firmware   Pattern
machuz     Architect      Architect   Architect  Reproducible
alice      Architect      Producer    —          Context-dependent
bob        Producer       Producer    Producer   Consistently Producer

Gravitational Lensing: When Others’ Scores Reveal Your Gravity

There’s a subtler phenomenon worth noting — one borrowed from astrophysics.

In physics, you can detect massive objects not by looking at them directly, but by observing how they bend the light of objects behind them. This is gravitational lensing.

In codebases, something similar happens. An Architect’s gravity is sometimes most visible not in their own scores, but in how it shapes everyone else’s scores.

When a strong Architect is present:

  • Other engineers’ Survival scores may be lower (the Architect’s code dominates blame)
  • The team’s Design axis distribution is skewed (one person absorbs most architectural changes)
  • New joiners’ scores reveal a characteristic “ramp-up curve” — they start low and gradually contribute to the existing structure

When that Architect leaves:

  • Multiple engineers’ scores shift simultaneously
  • Design Vacuum risk appears
  • The “flattening” of score distributions signals the loss of a gravitational center

You can observe this in eis timeline --team: the moment a gravitational center disappears, the entire team’s metrics ripple. The gravity was real — you just needed to look at its effects on others to see its full shape.

What Relativity Teaches Us

Engineering Relativity might seem like a “limitation” of EIS. If scores change with the environment, how can you make fair comparisons?

But I see this not as a limitation, but as a feature.

When relativity was discovered in physics, the fact that “there is no absolute time or space” was counterintuitive. But accepting it led to an exponentially deeper understanding of the universe.

EIS is the same.

The fact that scores change with environment teaches us that comparing engineers while ignoring their environment is inherently meaningless.

An engineer’s real capability cannot be measured in a vacuum. It always exists in relationship with the codebase — the universe they operate in.

Truly great engineers create gravity in any universe.

But that gravity looks different depending on the universe.

That’s Engineering Relativity.

Series

  • Chapter 1: Measuring Engineering Impact from Git History Alone
  • Chapter 2: Beyond Individual Scores: Measuring Team Health from Git History
  • Chapter 3: Two Paths to Architect: How Engineers Evolve Differently
  • Chapter 4: Backend Architects Converge: The Sacred Work of Laying Souls to Rest
  • Chapter 5: Timeline: Scores Don’t Lie, and They Capture Hesitation Too
  • Chapter 6: Teams Evolve: The Laws of Organization Revealed by Timelines
  • Chapter 7: Observing the Universe of Code
  • Chapter 8: Engineering Relativity: Why the Same Engineer Gets Different Scores (this post)

GitHub: engineering-impact-score — CLI tool, formulas, and methodology all open source. brew tap machuz/tap && brew install eis to install.


Building Agent Emulator: Habbo emulator + MCP 👨‍💻🏨

Running Claude AI Agents Inside a Habbo Hotel — via MCP

I built something a bit weird and a lot of fun: a local Habbo Hotel emulator where Claude AI agents can walk around, talk to players, hand out credits, moderate rooms, and manage accounts — all through an MCP server.

Here’s how it works and how you can run it yourself.

What is this exactly?

Habbo Hotel is a browser-based virtual world from the early 2000s. There are open-source emulators (I’m using Arcturus Morningstar) that let you run your own private hotel locally.

The twist: I connected Claude to it using the Model Context Protocol (MCP) — Anthropic’s standard for giving AI agents tools to interact with external systems. Instead of connecting Claude to a database or an API, I connected it to a virtual hotel.

Claude can now:

  • Create player accounts and generate login URLs
  • Talk, shout, or whisper as any online player
  • Give credits, duckets, diamonds, and badges
  • Teleport players between rooms
  • Kick and mute players
  • Read room chat logs
  • Broadcast hotel-wide alerts
  • Set player ranks

All triggered naturally in conversation, using Claude Code hooks.

The architecture

Claude (Claude Code + MCP client)
        │
        ▼
  habbo-mcp server  ──── MySQL (player data, chat logs)
        │
        ▼
  RCON TCP socket
        │
        ▼
  Arcturus emulator  (Docker)
        │
        ▼
  Nitro frontend  (browser client)

Three pieces:

1. The emulator stack (Docker)
Arcturus + MariaDB + Nitro React frontend, all in Docker Compose. One command starts a fully functional private Habbo hotel accessible in your browser.

2. The MCP server (Node.js / TypeScript)
A small MCP server that exposes hotel actions as tools. It talks to Arcturus over RCON (a raw TCP protocol Arcturus uses for server-to-server commands) and directly queries MySQL for read operations.

3. Claude Code with hooks
Claude Code connects to the MCP server and can use all 16 tools. With hooks you can trigger Claude automatically on events — for example, have it greet every new player that logs in, or moderate chat in real time.

Running it locally (coming soon!)

Clone the repo and run the setup script:

git clone ***/habbo-agent-emulator
cd habbo-agent-emulator
./setup.sh

That one command:

  • Checks you have Docker, Node.js 18+, and npm installed
  • Generates a random MCP API key
  • Writes habbo-mcp/.env with all connection details
  • Patches rcon.allowed in the emulator config automatically
  • Runs npm install
  • Prints the exact JSON block to paste into ~/.claude/settings.json

Then start the hotel:

cd emulator && just start-all

First run takes a few minutes to build. After that, open http://localhost:1080/?sso=123 in your browser, restart Claude Code, run /mcp and you should see habbo listed with all tools ready.

No manual config editing, no hunting for the right IP — the script handles the annoying parts.

What’s next

The repo is going public on GitHub soon. A few things I want to explore:

  • Autonomous NPCs — persistent Claude agents that live in the hotel, have a personality, and respond to players naturally
  • Event-driven hooks — trigger Claude on chat messages, room joins, or trades
  • Multi-agent setups — multiple Claude instances playing different roles (host, guide, moderator)

MCP makes all of this surprisingly straightforward. The protocol is simple, the tool definitions are just JSON schema, and Claude is genuinely good at deciding when and how to use them.

If you’re into retro web games, AI agents, or just want to see Claude tell someone to “go touch some pixels” in a virtual hotel lobby — stay tuned.

GitHub link coming soon.