Author Archives: DevegygiebyOL

What an OpenAI-Compatible API Router Should Actually Do

An OpenAI-compatible API router should not make your stack more complicated. If it does, it has already failed.

The whole point of compatibility is boring simplicity:

One base URL.

One API key.

Same general SDK shape.

That gives you room to improve the economics without rewriting the application.

For AI coding workflows, this matters because the tool in front is often already good enough. The pain is underneath: cost, provider management, usage logs, and routing.

The minimum useful setup should look familiar:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://incat.ai/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

If a router requires a large rewrite before you can test it, most developers will not bother. They are right.

The first test should be small:

  • one workflow
  • one API key
  • one prepaid balance
  • one cost comparison

What should the router do?

Route by task

Send routine work to cheaper capable models. Keep risky work on stronger models.

Preserve logs

Developers need to know which workflow burns money.

Avoid surprise bills

Prepaid credits are useful because they turn runaway usage into a visible constraint.

Keep escape hatches

If a cheaper route is not good enough, switch back. Routing should create options, not lock-in.
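
Routing by task with an escape hatch can be sketched in a few lines. This is only an illustration: the task labels and the model names (`cheap-model`, `strong-model`) are placeholders I made up, not inCat routes or real model identifiers.

```typescript
// Hypothetical task labels; a real router would define its own categories.
type TaskKind = "routine" | "risky";

// Placeholder model identifiers, not real model names.
const ROUTES: Record<TaskKind, string> = {
  routine: "cheap-model",
  risky: "strong-model",
};

// Pick a model for a task. The optional override is the escape hatch:
// if the cheaper route is not good enough, the caller can switch back.
function pickModel(kind: TaskKind, override?: string): string {
  return override ?? ROUTES[kind];
}
```

Calling `pickModel("routine", "strong-model")` forces the stronger route for a single workflow without touching the default table.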

That is the category I want inCat to live in.

Not another AI coding app.

Not a model museum.

An OpenAI-compatible API router for developers who want the same workflow to cost less.

Generate a config:

https://incat.ai/codex-config-generator.html

Finishing a Read-Only MCP Server: From 6 Tools to 9

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

I took an unfinished open-source MCP server for DEV.to and added the missing half.

The original repo (nickytonline/dev-to-mcp) was built by an actual DEV.to engineer and shipped six read-only tools: get_articles, get_article, get_user, get_tags, get_comments, search_articles. Useful for reading, useless for writing.

I extended it with three write tools:

  • create_article for publishing new articles (draft or live)
  • update_article for editing existing ones
  • delete_article for unpublishing

The result is a full read-write MCP server that lets Claude (or any MCP client) treat DEV.to like a CMS. This article was created and published using it.

Demo

The tool list in Claude Desktop after the build:

Read-only tools (6):
  Get Articles, Get Article, Get User, Get Tags, Get Comments, Search Articles

Write/delete tools (3):
  Create Article, Update Article, Delete Article

A draft creation call looks like this:

{
  "tool": "create_article",
  "args": {
    "title": "My new post",
    "body_markdown": "# Hello world",
    "tags": ["webdev", "ai"],
    "published": false
  }
}

The MCP server hits POST https://dev.to/api/articles with the user’s DEVTO_API_KEY from env, returns the article ID, and Claude can immediately call update_article against it. No browser, no copy-paste from chat to editor.

The Journey

The original repo was solid but limited. I asked myself: why use an MCP server that can only read?

Setup was the first wall. The npm package wasn’t published, so npx -y @nickytonline/dev-to-mcp returned 404. Then npm install -g github:... failed because the repo had no top-level package.json at the install path npm expected. The fix was unglamorous: git clone, npm install, npm run build, point Claude Desktop’s config at the local dist/index.js.

There was also a Windows-specific gotcha. Claude Desktop on Windows needs npx.cmd, not npx. The error message was just Server disconnected. Logs showed bad option: -y because the config still had the npx flag while the command had been swapped to node. Small things, two hours.

Once the read-only server was running, the actual finish-up work was straightforward. The codebase used a clean handler pattern: each tool was a function that called the DEV.to API and returned a typed response. I followed the same pattern for the three new tools:

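// Minimal input shape, inferred from the create_article example above;
// the real type in the repo may differ
interface ArticleInput {
  title: string;
  body_markdown: string;
  tags?: string[];
  published?: boolean;
}

// Pattern from the existing read tools
async function getArticle(id: number) {
  const res = await fetch(`https://dev.to/api/articles/${id}`, {
    // Default to an empty string so the header value is always a string
    headers: { 'api-key': process.env.DEVTO_API_KEY ?? '' }
  });
  // fetch resolves on 4xx/5xx responses, so check the status explicitly
  if (!res.ok) throw new Error(`DEV.to API error: ${res.status}`);
  return res.json();
}

// New write tool, same shape
async function createArticle(article: ArticleInput) {
  const res = await fetch('https://dev.to/api/articles', {
    method: 'POST',
    headers: {
      'api-key': process.env.DEVTO_API_KEY ?? '',
      'Content-Type': 'application/json'
    },
    // The DEV.to API expects the fields wrapped in an "article" object
    body: JSON.stringify({ article })
  });
  if (!res.ok) throw new Error(`DEV.to API error: ${res.status}`);
  return res.json();
}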

Register the new handlers in the MCP server’s tool list, rebuild with npm run build, restart Claude Desktop. Done.
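
For the update path, here is a rough sketch of how the request could be assembled. It assumes the DEV.to endpoint `PUT /api/articles/{id}` with the same `{ article: ... }` wrapper as the create call. The helper name `buildUpdateRequest` and the `ArticleUpdate` type are hypothetical and are not taken from the fork.

```typescript
// Fields mirror the create_article example; all are optional for an update.
interface ArticleUpdate {
  title?: string;
  body_markdown?: string;
  tags?: string[];
  published?: boolean;
}

// Hypothetical helper: builds the request without sending it,
// so the URL, method, and body can be inspected or tested offline.
function buildUpdateRequest(id: number, article: ArticleUpdate, apiKey: string) {
  return {
    url: `https://dev.to/api/articles/${id}`,
    init: {
      method: "PUT",
      headers: { "api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ article }),
    },
  };
}
```

A handler would then pass `url` and `init` to `fetch`. Sending `{ published: false }` this way is one plausible way to unpublish an article.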

Tech Stack

  • TypeScript for the server code
  • Vite for the build (12.83 kB output, builds in 133ms)
  • Model Context Protocol SDK for the server scaffolding
  • DEV.to API v1 as the backend
  • Claude Desktop as the MCP client

What I Learned

Two things stood out.

First, finishing someone else’s project is faster than starting from scratch. The repo had types, patterns, error handling, and tests already in place. Adding three tools meant matching an existing shape, not inventing one. The full add-rebuild-test cycle was under thirty minutes.

Second, AI assistance works best when there’s existing structure to imitate. I gave Claude Code the repo path and asked for three tools matching the existing pattern. It read the codebase, identified the handler signature, knew the DEV.to API endpoints from training data, and produced working code on the first build. Without the existing read-only tools as reference, the output would have needed more iteration.

What’s Next

The MCP server still has gaps:

  • No image upload (DEV.to requires base64 inline or external URLs)
  • No get_followers or get_following
  • No comment write/delete
  • No analytics endpoints

These are small additions following the same pattern. The hard part was the first three.

Repo

The forked repo with write tools: github.com/glatinone/dev-to-mcp

Original credit to @nickytonline for the read-only foundation.

How Excel Is Used in Real-World Data Analysis

I’ve always known Excel as a tool for creating tables and performing simple calculations. However, after spending a week learning its fundamentals, I now understand why Excel remains one of the most widely used tools in data analysis.
Microsoft Excel is a spreadsheet application that allows users to collect, organize, clean, analyze, calculate, and visualize data. Its user-friendly interface and powerful features make it a valuable tool for individuals and organizations across different industries.
One way Excel is used in real-world data analysis is in business decision-making. Companies collect large amounts of data on sales, customers, and operations. Analysts use Excel to sort and filter this data, helping managers identify trends, monitor performance, and make informed decisions. For example, a retail business can sort products by sales volume to identify its best-selling items.
Excel is also widely used in financial reporting. Businesses use it to track expenses, calculate profits, prepare budgets, and generate financial reports. With formulas and formatting tools, financial data can be organized in a way that is easy to understand and analyze.
Another common application is marketing performance analysis. Marketing teams collect data from campaigns, websites, and social media platforms. Excel can be used to analyze campaign results, compare performance metrics, and identify which strategies are generating the best outcomes.
Throughout this week, I learned several Excel features and formulas that are useful in data analysis. The first is filtering, which allows analysts to display only the data that meets specific criteria. This is useful when working with large datasets and looking for particular information. I also learned about data validation, which helps maintain data quality by restricting the type of information users can enter into cells. This reduces errors and improves data accuracy.
In addition, I learned functions such as SUM(), AVERAGE(), and COUNT(). SUM() helps calculate totals, AVERAGE() finds the mean value of a dataset, and COUNT() determines how many numerical values exist within a range. These functions make it easier to summarize and understand data quickly. I also found text functions such as TRIM() and PROPER() useful for cleaning and standardizing data before analysis.
Learning Excel has changed the way I see data. Before, I saw data as a collection of numbers and text. Now, I see it as information that can tell a story and support decision-making when properly organized and analyzed. Excel has shown me that effective data analysis begins with understanding how to clean, structure, and explore data. As I continue my journey in data science, I can already see how these foundational Excel skills will support my learning of more advanced tools and techniques.

AI Agent Safety Needs Stop Signs, Not Just Instructions

AI agents do not only need better instructions.

They need stop signs.

That is one of the clearest reasons Ota exists as software execution governance for humans and AI agents. A repo should not merely tell an agent what it can try. It should declare what the agent must not do, when it must stop, and what requires human approval.

Prompts and AGENTS.md files are useful. They give agents context: how the project is organized, what style to follow, how to summarize changes, and which areas need caution.

But advice is not a boundary.

An instruction says:

Be careful with database commands.

A stop sign says:

Do not run destructive database commands unless explicitly approved.

An instruction says:

Avoid editing generated files.

A stop sign says:

These paths are protected. Stop if the requested edit falls outside the writable boundary.

That difference matters because modern agents are no longer passive readers. They inspect repos, choose commands, edit files, run checks, interpret failures, and report completion.

If the repo gives them only guidance, they still have to infer the boundary.

Ota’s position is sharper: agent execution should not depend on inference. It should be governed by the repo.

Instructions tell agents what to attempt

Most agent guidance is written as advice.

It says:

  • follow the existing style
  • prefer small changes
  • run tests before finishing
  • avoid touching generated files
  • do not expose secrets
  • explain what changed

That helps. It makes agents less generic and more aware of the repo they are working inside.

But it still leaves the dangerous questions open.

Which tests should the agent run?
Which commands are allowed?
Which files are generated?
Which services require approval?
Which failures mean “fix the code” and which mean “stop and ask”?
Which paths are out of bounds?

A capable agent may make reasonable guesses.

But reasonable guesses are not governance.

For low-risk editing, guidance may be enough. For repo execution, CI, automation, and agentic development, the repo needs something stronger.

Stop signs define when not to continue

A stop sign is not a suggestion.

It is a boundary.

In a repo, stopping rules should cover at least five areas.

1. Secrets and credentials

An agent should not invent secrets, request private values indirectly, or edit sensitive environment files just to make a task pass.

If a command needs an API key, database password, cloud token, or private credential, the correct behavior is not improvisation.

The correct behavior is to stop and report the blocker.

2. External services

Some tasks depend on systems outside the repo: cloud infrastructure, managed databases, payment providers, queues, object storage, or production-like services.

If those services are unavailable, the agent should not patch code around the failure.

It should identify the missing dependency and stop.

3. Unsafe mutation

Some commands change state.

deploy
publish
db:reset
terraform apply

These are not cousins of test, lint, or build.

If a task can mutate external state, delete data, publish packages, or affect infrastructure, the repo should not outsource that decision to the agent’s confidence.

That boundary should be declared.

4. Protected paths

Agents need to know where they can work.

Source files and tests may be open. Generated files, migrations, lockfiles, production config, and environment files may need review or approval.

This is not about slowing the agent down.

It is about preventing quiet damage in files that carry operational weight.
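
A path boundary like this can be checked mechanically. The sketch below is a simplified illustration, not Ota's implementation: the prefix lists and the `canEdit` helper are hypothetical, and it assumes paths are already normalised and relative to the repo root.

```typescript
// Hypothetical boundary lists; the path names are illustrative only.
const WRITABLE = ["src/", "tests/"];
const PROTECTED = ["src/generated/", "migrations/", ".env"];

// A path is editable only if it sits under a writable prefix
// and is not covered by a protected prefix. Anything else is a stop.
function canEdit(path: string): boolean {
  const isProtected = PROTECTED.some((prefix) => path.startsWith(prefix));
  const isWritable = WRITABLE.some((prefix) => path.startsWith(prefix));
  return isWritable && !isProtected;
}
```

The check is default-deny: a file such as `package-lock.json` is refused simply because nothing declares it writable.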

5. Verification limits

Agents also need to know when verification is finite.

A long-running dev server is not a verification result.
A watch mode is not a handoff signal.
A task that never terminates is not the same as a bounded check.

Agent-safe tasks need finite verification paths: run, finish, report status.

Without that, the agent may wait indefinitely, stop too early, or report success without a meaningful result.

This is execution governance

This is bigger than prompt quality.

If an agent runs a risky command, edits a protected file, or treats missing credentials as a code problem, the issue is not only that the agent made a poor choice.

The repo failed to govern execution.

Software execution governance means the repo can declare:

  • what it needs
  • how it should be prepared
  • what can be executed
  • what requires approval
  • where agents can write
  • when verification is complete
  • when execution must stop

That is the frame Ota is built around.

Not “better setup docs.”

Not “another task runner.”

Ota is the contract-first way to make execution boundaries explicit for humans, CI, automation, and AI agents.

How Ota makes stop signs explicit

In an Ota-backed repo, stopping rules do not have to live only in prose.

The contract can declare safe tasks, verification tasks, writable paths, protected paths, setup requirements, and readiness blockers.

That gives agents a governed operating model:

If the task is declared safe, proceed.
If setup is required, prepare from the contract.
If the contract is invalid, stop.
If secrets or credentials are missing, stop.
If the requested edit is outside writable paths, stop.
If the task mutates external state without approval, stop.
If verification is complete, report the result.
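
The ladder above can be read as a pure decision function. The sketch below uses a hypothetical request shape; it is not Ota's contract schema or API. It also assumes a default-deny rule for tasks that are not declared safe, which the list above implies but does not state.

```typescript
// Hypothetical request shape for illustration only.
interface ExecutionRequest {
  contractValid: boolean;
  secretsAvailable: boolean;
  editInsideWritablePaths: boolean;
  mutatesExternalState: boolean;
  approved: boolean;
  declaredSafe: boolean;
}

type Decision = { action: "proceed" } | { action: "stop"; reason: string };

// Check every stop condition first; proceed only when none applies.
function decide(req: ExecutionRequest): Decision {
  if (!req.contractValid) return { action: "stop", reason: "invalid contract" };
  if (!req.secretsAvailable) return { action: "stop", reason: "missing secrets or credentials" };
  if (!req.editInsideWritablePaths) return { action: "stop", reason: "edit outside writable paths" };
  if (req.mutatesExternalState && !req.approved) {
    return { action: "stop", reason: "unapproved external mutation" };
  }
  // Assumption: anything not declared safe is refused by default.
  if (!req.declaredSafe) return { action: "stop", reason: "task not declared safe" };
  return { action: "proceed" };
}
```

Returning a reason with every stop gives the agent something concrete to report instead of guessing at the next step.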

That is stronger than telling an agent to “be careful.”

Ota’s agent quickstart follows this same principle: agents should prefer repo-local contracts when they exist, execute declared safe tasks, parse JSON output instead of scraping terminal prose, and stop when blockers involve secrets, credentials, external services, unsafe mutation, or paths outside declared boundaries.

The command surface supports that model:

  • ota doctor checks readiness and surfaces blockers before work begins.
  • ota validate checks whether the contract itself is usable.
  • ota tasks shows what work the repo has declared.
  • ota up --dry-run previews setup before changing the environment.
  • ota run <task> --json runs declared work and returns stable status for automation.

The point is not that every agent action needs ceremony.

The point is that dangerous ambiguity should be removed before execution happens.

AGENTS.md still matters

This does not make AGENTS.md useless.

It means AGENTS.md should do what prose does best: explain context.

Use it for style, conventions, architectural notes, review expectations, and collaboration preferences.

Use Ota for the execution boundary.

A clean split looks like this:

AGENTS.md:
How the agent should behave.

ota.yaml:
What the repo allows, requires, verifies, and refuses.

One gives the agent context.

The other governs the repo.

Together, they produce a better operator: one that understands the project and knows where the guardrails are.

Stop signs build trust

Teams do not trust agents because agents sound confident.

They trust agents when the repo constrains what the agent can do, makes the approved path obvious, and produces evidence for what happened.

A good stop sign does not make agents less useful.

It makes them dependable.

It tells the agent:

Move quickly here.
Slow down here.
Stop here.
Ask here.
Report this.
Do not guess.

That is the behavior serious teams need as AI agents move from code suggestion into repo execution.

Conclusion

AI agents need instructions.

But instructions alone are not enough.

A repo that only tells agents what to do still leaves too much room for unsafe interpretation. The next layer is stopping rules: clear boundaries for secrets, external services, unsafe mutation, protected paths, and finite verification.

That is why Ota’s contract-first model matters.

It turns agent safety from advice into execution governance.

The future of AI-assisted development will not be won by repos that merely prompt agents better.

It will be won by repos that know when agents should stop.

  • Explore the Ota getting started guide
  • Check out the Ota examples repo

Originally posted @ ota.run

How the Internet Actually Works – Networking, DNS, Architecture & My DMI DevOps Journey

Week 0: How the Internet Actually Works – Networking, DNS, Architecture & My DevOps Journey Begins

I recently joined the DevOps Micro Internship (DMI) – Cohort 3, a free, project-based program by Pravin Mishra at CloudAdvisory. Before we dive into the exciting parts – containers, CI/CD pipelines, Kubernetes, cloud platforms – the program correctly insists on mastering the foundational concepts first.

This post documents everything I worked through in Week 0, covering five core tasks and my honest reflections. If you are starting your DevOps journey, this post is for you too. Consider this a beginner-friendly technical reference, not just a journal entry.

Why Foundations Matter in DevOps

It is tempting to jump straight into Docker or AWS. I get it – the tools look cool, the job postings mention them everywhere, and YouTube tutorials make them seem approachable. But here is the uncomfortable truth: tools break, documentation changes, and architectures evolve. The underlying fundamentals do not change nearly as fast.

A DevOps engineer who understands how data actually travels across a network, why DNS exists, and how application layers are separated will debug production incidents faster, design more resilient systems, and adapt to new tools with far less friction. That is the mindset behind Week 0.

Let’s get into it.

Task 1 – Exploring Concepts with AI: Networking Protocols

The first task involved using ChatGPT to explore networking protocols from first principles. The goal was not just to get an answer, but to learn how to ask precise questions and synthesise the response into a genuine understanding.

What Are Networking Protocols?

A networking protocol is a standardised set of rules that governs how data is transmitted between devices on a network. Without protocols, two devices attempting to communicate would be like two people trying to have a conversation, one speaking English and the other French, with no shared framework.

Protocols define:

  • Format: What does a valid message look like?
  • Sequencing: Who speaks first? Who speaks next?
  • Error handling: What happens when something goes wrong?
  • Termination: How does the conversation end cleanly?

Think of it like road traffic laws. The laws do not build the roads, but they ensure that everyone using the roads does so in a predictable, safe, and efficient manner. Without them, even a perfectly built road would result in chaos.

Key Insight from This Task

What struck me most was how protocols operate in layers. No single protocol handles everything. Instead, a stack of protocols each handles a specific concern, and together they make the internet function. This layered thinking – breaking a complex problem into isolated, composable responsibilities – is also a core principle in software architecture and DevOps. I would encounter it again and again as the week progressed.

Task 2 – Internet & Networking Fundamentals

This task required me to explain four foundational concepts in my own words. Here is my in-depth take on each.

Packet Switching

When you send a message, a file, or a video stream across the internet, that data is not sent as one giant, continuous stream. Instead, it is broken into small chunks called packets. Each packet contains a piece of the actual data (the payload), plus metadata – the source address, destination address, sequence number, and error-checking information.

These packets do not all travel the same route. Routers across the internet evaluate network conditions in real time and forward each packet along the most efficient path available at that moment. At the destination, the packets are reassembled in the correct order.
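
To make the split-and-reassemble idea concrete, here is a toy sketch. The `Packet` shape is a simplification of my own: real IP packets also carry addresses and checksums, and real reassembly happens inside the network stack.

```typescript
interface Packet {
  seq: number;     // position of this chunk in the original message
  total: number;   // how many packets make up the message
  payload: string; // the slice of data carried by this packet
}

// Split a message into fixed-size chunks, tagging each with a sequence number.
function toPackets(message: string, size: number): Packet[] {
  const total = Math.ceil(message.length / size);
  const packets: Packet[] = [];
  for (let seq = 0; seq < total; seq++) {
    packets.push({ seq, total, payload: message.slice(seq * size, (seq + 1) * size) });
  }
  return packets;
}

// Reassemble packets that may have arrived in any order.
function reassemble(packets: Packet[]): string {
  return [...packets]
    .sort((a, b) => a.seq - b.seq)
    .map((p) => p.payload)
    .join("");
}
```

Shuffling or reversing the output of `toPackets` before calling `reassemble` still yields the original message, which is the property the sequence numbers exist to provide.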

Why does this matter? Packet switching is what makes the internet resilient. If one network link fails, packets can be rerouted around it, so a single failed link does not have to end the whole exchange. This is a fundamentally different (and superior) model from the old circuit-switched telephone network, where a dedicated line had to remain open for the entire duration of a call.

The DevOps connection: When you are debugging network latency or packet loss in a distributed system, understanding packet switching tells you why packets arrive out of order, why retransmission happens, and where to look when something is slow.

IP Address

An IP (Internet Protocol) address is a numerical label assigned to every device on a network. It serves two core purposes: host identification (which device is this?) and location addressing (where is this device on the network?).

There are two versions currently in use:

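Version | Format                | Example                        | Address Space
--------|-----------------------|--------------------------------|---------------------------
IPv4    | 32-bit, four octets   | 192.168.1.1                    | ~4.3 billion addresses
IPv6    | 128-bit, eight groups | 2001:0db8:85a3::8a2e:0370:7334 | ~340 undecillion addresses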

The world ran out of IPv4 addresses years ago. Techniques like NAT (Network Address Translation) have extended IPv4’s lifespan by allowing multiple devices on a private network to share a single public IP, but IPv6 adoption is the long-term solution.
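
The "32-bit" part and the public/private distinction are easy to see in code. This is a minimal sketch: the helper names are mine, and the private ranges are the three RFC 1918 blocks.

```typescript
// Convert dotted-quad IPv4 text into its 32-bit unsigned integer value.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`Invalid IPv4 address: ${ip}`);
  // Multiplication avoids the sign issues of 32-bit bitwise shifts in JavaScript.
  return parts.reduce((acc, p) => acc * 256 + Number(p), 0);
}

// RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
function isPrivate(ip: string): boolean {
  const n = ipv4ToInt(ip);
  const inRange = (start: string, end: string) =>
    n >= ipv4ToInt(start) && n <= ipv4ToInt(end);
  return (
    inRange("10.0.0.0", "10.255.255.255") ||
    inRange("172.16.0.0", "172.31.255.255") ||
    inRange("192.168.0.0", "192.168.255.255")
  );
}
```

`192.168.1.1` falls in a private range, which is why many home routers can use it at the same time behind NAT.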

The DevOps connection: You will work with IP addresses constantly – assigning them to servers, configuring security group rules, setting up load balancers, and troubleshooting connectivity. Understanding the difference between public and private IPs, and how subnetting works, is essential for cloud networking on AWS, GCP, or Azure.

TCP/IP

TCP/IP is not one protocol but a suite of protocols. The two most important are:

IP (Internet Protocol) – handles addressing and routing. It is responsible for getting packets from a source to a destination, but it is connectionless and does not guarantee delivery or order.

TCP (Transmission Control Protocol) – adds reliability on top of IP. Before any data is sent, TCP performs a three-way handshake:

  1. SYN: The client sends a synchronise packet to the server.
  2. SYN-ACK: The server acknowledges and sends its own synchronise.
  3. ACK: The client acknowledges the server’s response.

A connection is now established. TCP then ensures every packet is received, requests retransmission of any lost packets, and delivers data to the application layer in the correct order.
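
The handshake also sets up sequence numbers: each side picks an initial sequence number (ISN), and the other side acknowledges it plus one. The sketch below models only that bookkeeping. The `Segment` shape is a simplification, and it ignores the wraparound at 2^32 that real TCP handles.

```typescript
interface Segment {
  flags: "SYN" | "SYN-ACK" | "ACK";
  seq: number;  // sender's sequence number
  ack?: number; // acknowledgement number, absent on the first SYN
}

// Build the three handshake segments from the two initial sequence numbers.
function handshake(clientIsn: number, serverIsn: number): Segment[] {
  return [
    { flags: "SYN", seq: clientIsn },                          // client -> server
    { flags: "SYN-ACK", seq: serverIsn, ack: clientIsn + 1 },  // server -> client
    { flags: "ACK", seq: clientIsn + 1, ack: serverIsn + 1 },  // client -> server
  ];
}
```

When you read a packet capture, matching each `ack` against the other side's `seq + 1` is a quick way to confirm that a handshake actually completed.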

UDP (User Datagram Protocol) is the alternative – connectionless, no handshake, no guaranteed delivery. It is faster, which makes it ideal for video streaming, gaming, and DNS lookups where a dropped packet is less catastrophic than a delay.

The DevOps connection: When you configure a load balancer, you choose between TCP and HTTP (which runs on top of TCP). When you expose a port in a Dockerfile, you can specify TCP or UDP (TCP is the default). Understanding this layer is the difference between configuring things by guessing and configuring them with confidence.

HTTP and HTTPS

HTTP (HyperText Transfer Protocol) is the application-layer protocol used to transfer web pages, APIs, and other resources over the internet. It operates on a simple request-response model:

  1. A client (browser, API consumer, CLI tool) sends an HTTP request with a method (GET, POST, PUT, DELETE), headers, and optionally a body.
  2. A server returns an HTTP response with a status code, headers, and optionally a body.
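
On the wire, HTTP/1.1 messages are plain text. As a rough sketch, here is how a request can be serialised and how a response status line can be parsed. The helper names are hypothetical, and real clients handle many more details such as chunked bodies and header folding.

```typescript
// Serialise a request into HTTP/1.1 wire format:
// request line, header lines, a blank line, then the optional body.
function formatRequest(
  method: string,
  path: string,
  headers: Record<string, string>,
  body = "",
): string {
  const headerLines = Object.entries(headers).map(([k, v]) => `${k}: ${v}`);
  return [`${method} ${path} HTTP/1.1`, ...headerLines, "", body].join("\r\n");
}

// Parse the first line of a response, e.g. "HTTP/1.1 200 OK".
function parseStatusLine(line: string): { version: string; status: number; reason: string } {
  const match = /^(HTTP\/\d\.\d) (\d{3}) ?(.*)$/.exec(line);
  if (!match) throw new Error(`Malformed status line: ${line}`);
  return { version: match[1], status: Number(match[2]), reason: match[3] };
}
```

This is the text you see when you run `curl -v` against an HTTP endpoint. With HTTPS, the same bytes travel inside the encrypted TLS channel.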

HTTPS (HTTP Secure) wraps HTTP inside TLS (Transport Layer Security), which provides:

  • Encryption: Data in transit cannot be read by third parties (man-in-the-middle attacks are thwarted).
  • Authentication: The server’s identity is verified via a certificate signed by a trusted Certificate Authority (CA).
  • Integrity: Data cannot be tampered with in transit without detection.

The analogy I find most intuitive: HTTP is like sending a postcard – anyone handling it can read what it says. HTTPS is like sending a letter in a tamper-proof, locked box. Only the intended recipient has the key.

The DevOps connection: You will configure TLS certificates using tools like Let’s Encrypt and Cert-Manager. You will set up HTTPS on Nginx or a cloud load balancer. You will debug SSL handshake failures and certificate expiry alerts. Knowing what HTTPS actually does – not just that it “adds a padlock” – makes all of this manageable.

Task 3 – Application Architecture: Two-Tier vs. Three-Tier

Modern applications are not monolithic blobs of code. They are organised into architectural tiers – logical layers that separate concerns, enable independent scaling, and support team-based development. Understanding these tiers is critical for anyone working in DevOps, because you need to know what you are deploying, where each component lives, and how the layers communicate.

Two-Tier Architecture

In a two-tier (client-server) architecture, the application is split into exactly two layers:

┌─────────────────────┐
│    CLIENT TIER      │  ← Presentation + Business Logic
│ (Browser / Desktop) │
└──────────┬──────────┘
           │ Direct DB queries
           ▼
┌─────────────────────┐
│   DATABASE TIER     │  ← Data Storage
│ (MySQL / PostgreSQL)│
└─────────────────────┘

When it works well: Small internal tools, desktop applications with a limited number of users, and rapid prototyping. The simplicity means less infrastructure to manage.

Where it breaks down: The client handles both the UI and business logic. This means every client must be updated when business rules change. It also means clients often have direct database access, which is a serious security concern at scale.

Technologies typically involved:

Tier     | Examples
---------|------------------------------------------
Client   | HTML/CSS, React, Angular, Desktop apps
Database | MySQL, PostgreSQL, SQLite

Three-Tier Architecture

Three-tier architecture introduces a dedicated middle layer – the application server (or backend) – between the client and the database.

┌─────────────────────┐
│   PRESENTATION TIER │  ← UI only
│  (Browser / Mobile) │
└──────────┬──────────┘
           │ HTTP/HTTPS requests
           ▼
┌─────────────────────┐
│   APPLICATION TIER  │  ← Business Logic & APIs
│ (Node.js / Django)  │
└──────────┬──────────┘
           │ Parameterised queries
           ▼
┌─────────────────────┐
│     DATA TIER       │  ← Persistent Storage
│ (PostgreSQL/MongoDB)│
└─────────────────────┘

Why this matters:

  • Security: No client ever touches the database directly. The backend validates and sanitises all input before any query is executed.
  • Scalability: Each tier can be scaled independently. If your API is the bottleneck, you spin up more backend instances without touching the frontend or the database.
  • Maintainability: Business logic lives in one place. Change a rule in the backend, and all clients – web, mobile, CLI – immediately reflect that change.
  • Team autonomy: Frontend engineers, backend engineers, and DBAs can work in parallel without constantly stepping on each other.
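
The security point can be illustrated with a small application-tier function that validates input and only then builds a parameterised query. Everything here is a sketch under assumptions: the table and column names are invented, and the `$1` placeholder follows PostgreSQL driver conventions.

```typescript
// A parameterised query keeps SQL text and user-supplied values separate.
interface Query {
  text: string;
  values: unknown[];
}

// Hypothetical application-tier handler: validate first, then build the query.
function findUserByEmail(rawEmail: string): Query {
  const email = rawEmail.trim().toLowerCase();
  // Deliberately simple shape check; real validation is usually stricter.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    throw new Error("Invalid email address");
  }
  // $1 is a placeholder; the database driver binds the value separately,
  // so user input is never spliced into the SQL text.
  return { text: "SELECT id, name FROM users WHERE email = $1", values: [email] };
}
```

The client only ever sends an email string to the backend. It never sees the SQL, and the database only ever receives queries the backend has shaped.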

Technologies typically involved:

Tier     | Examples
---------|----------------------------------------------------
Frontend | HTML, CSS, JavaScript, React, Angular, Vue
Backend  | Node.js, Express.js, Django, Spring Boot, FastAPI
Database | MySQL, PostgreSQL, MongoDB, Redis

The DevOps connection: When you write a Kubernetes deployment, you are typically deploying each tier as a separate service with its own pods, resource limits, health checks, and scaling policies. When you design a CI/CD pipeline, you often have separate pipelines for the frontend and backend. When you configure a database, you write network policies that allow only the backend service to connect. Three-tier thinking is baked into modern infrastructure.

Task 4 – Domain Name System (DNS) Deep Dive

DNS is one of those technologies that most people take for granted – until it breaks. When DNS goes down, the internet, from a user’s perspective, ceases to work. Understanding how it works is not optional for a DevOps engineer.

What is DNS?

DNS stands for Domain Name System. Its primary job is to translate human-readable domain names (like epicreads.com) into machine-readable IP addresses (like 52.172.142.222).

Without DNS, you would need to memorise the IP address of every website you want to visit. DNS is the phonebook of the internet.

How DNS Resolution Works (Step by Step)

When you type epicreads.com into your browser and hit Enter, here is what actually happens:

Browser → OS Cache → Recursive Resolver → Root Nameserver
       → TLD Nameserver (.com) → Authoritative Nameserver
       → Returns IP → Browser connects to 52.172.142.222

  1. Browser cache: The browser checks its own cache. Did it look up this domain recently?
  2. OS cache: If not, the operating system checks its local hosts file (/etc/hosts on Linux) and its own DNS cache (for example, the DNS Client service on Windows).
  3. Recursive resolver: If still not found, the query goes to your ISP’s (or a public) recursive resolver, such as 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare). This resolver does the heavy lifting on your behalf.
  4. Root nameservers: The resolver asks a root nameserver. There are 13 root server identities (a.root-servers.net through m.root-servers.net), each served by many machines around the world. They do not know the IP of epicreads.com, but they know who is authoritative for .com domains.
  5. TLD nameservers: The .com nameserver knows which nameserver is authoritative for epicreads.com.
  6. Authoritative nameserver: This is the nameserver managed by the domain’s owner (e.g., via AWS Route 53 or Cloudflare). It returns the definitive answer: the IP address associated with epicreads.com.
  7. Response travels back: The IP is cached at multiple levels (with a TTL – Time to Live – that controls how long it stays cached) and returned to the browser.
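
The steps above can be simulated with a few lookup tables and a TTL cache. This is a toy model, not a real resolver: the nameserver hostnames are made up, and the clock is passed in as a parameter so the caching behaviour is easy to follow.

```typescript
// Toy data standing in for the DNS hierarchy; the nameserver names are invented.
const ROOT: Record<string, string> = { com: "tld-com.example" };
const TLD: Record<string, Record<string, string>> = {
  "tld-com.example": { "epicreads.com": "ns.epicreads.example" },
};
const AUTHORITATIVE: Record<string, Record<string, string>> = {
  "ns.epicreads.example": { "epicreads.com": "52.172.142.222" },
};

interface CacheEntry { ip: string; expiresAt: number }
const cache = new Map<string, CacheEntry>();

// Resolve a name the way a recursive resolver would:
// cache first, then root -> TLD -> authoritative.
function resolve(domain: string, nowMs: number, ttlSeconds = 300): { ip: string; fromCache: boolean } {
  const cached = cache.get(domain);
  if (cached && cached.expiresAt > nowMs) return { ip: cached.ip, fromCache: true };

  const tld = domain.split(".").pop() ?? "";
  const tldServer = ROOT[tld];                                          // root: who handles .com?
  const authServer = tldServer ? TLD[tldServer]?.[domain] : undefined;  // TLD: who handles the domain?
  const ip = authServer ? AUTHORITATIVE[authServer]?.[domain] : undefined;
  if (!ip) throw new Error(`No answer for ${domain}`);

  // Cache the answer until the TTL runs out.
  cache.set(domain, { ip, expiresAt: nowMs + ttlSeconds * 1000 });
  return { ip, fromCache: false };
}
```

The second lookup inside the TTL window never leaves the cache, which is why most DNS queries feel instant and why changes take time to become visible.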

DNS Record Types

Record Type | Purpose                                                  | Example
------------|----------------------------------------------------------|------------------------------------------
A           | Maps a domain to an IPv4 address                         | epicreads.com → 52.172.142.222
AAAA        | Maps a domain to an IPv6 address                         | epicreads.com → 2001:db8::1
CNAME       | Alias – maps one domain to another                       | www.epicreads.com → epicreads.com
MX          | Mail exchange – specifies mail servers                   | epicreads.com → mail.google.com
TXT         | Arbitrary text – used for SPF, DKIM, domain verification | v=spf1 include:_spf.google.com ~all
NS          | Nameserver – delegates a domain to specific DNS servers  | epicreads.com → ns1.cloudflare.com
SOA         | Start of Authority – metadata about the DNS zone         | Includes serial number, refresh intervals

Connecting epicreads.com to 52.172.142.222

To map the domain epicreads.com to the IP address 52.172.142.222, you create an A Record in the domain’s DNS zone:

epicreads.com.   300   IN   A   52.172.142.222

  • epicreads.com. – the hostname (the trailing dot indicates the DNS root)
  • 300 – the TTL in seconds (5 minutes); after this time, cached records expire
  • IN – Internet class
  • A – record type (IPv4 address mapping)
  • 52.172.142.222 – the destination IP address
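
The five fields of that record can be pulled apart programmatically. A minimal sketch, assuming every field is present on one line (real zone files also allow omitted TTLs, comments, and multi-line records); `parseRecord` is a hypothetical helper.

```typescript
interface ResourceRecord {
  name: string;  // hostname, including the trailing dot
  ttl: number;   // cache lifetime in seconds
  class: string; // almost always "IN"
  type: string;  // record type, e.g. "A"
  data: string;  // record value, e.g. the IPv4 address
}

// Parse a single zone-file style line such as
// "epicreads.com.   300   IN   A   52.172.142.222".
function parseRecord(line: string): ResourceRecord {
  const [name, ttl, cls, type, ...rest] = line.trim().split(/\s+/);
  if (!name || !ttl || !cls || !type || rest.length === 0 || !/^\d+$/.test(ttl)) {
    throw new Error(`Cannot parse record: ${line}`);
  }
  // TXT-style values may contain spaces, so join the remaining tokens back together.
  return { name, ttl: Number(ttl), class: cls, type, data: rest.join(" ") };
}
```

Parsing records like this is handy when you want to diff DNS changes in a pipeline before applying them.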

Why not a CNAME? A CNAME maps a name to another name, not to an IP address. Standard DNS also does not allow a CNAME at the zone apex (the root domain, e.g., epicreads.com itself), because the apex must carry other records such as NS and SOA, and a CNAME cannot coexist with them. CNAMEs can only be used on subdomains. So www.epicreads.com could be a CNAME pointing to epicreads.com, but epicreads.com itself needs an A record (or an AAAA record for IPv6).

The DevOps connection: You will configure DNS records constantly – pointing domains to load balancers, configuring subdomains for different services, setting up MX records for transactional email, and adding TXT records to verify domain ownership for SSL certificates. Understanding TTL is critical too: if you set a TTL of 86400 (24 hours) and need to change an IP urgently, you will be waiting a very long time for the change to propagate globally.

Task 5 – Development Environment Setup: Visual Studio Code

A professional development environment is not a luxury – it is the foundation on which all your work is built. I set up Visual Studio Code (VS Code) as my primary editor for this internship.

Why VS Code for DevOps?

VS Code has become the de facto standard for DevOps engineers for several reasons:

  • Language support: From Python and Go to Bash and YAML, VS Code handles everything through its extension marketplace.
  • Integrated terminal: You can run commands without switching windows, which becomes enormously productive over time.
  • Git integration: Built-in source control panel with diff views, staging, committing, and branching.
  • Extension ecosystem: Thousands of extensions for Docker, Kubernetes, Terraform, AWS, Azure, and more.
  • Remote development: The Remote – SSH and Dev Containers extensions allow you to develop directly on remote servers or inside containers, which is invaluable for DevOps workflows.

Key Extensions I Installed

| Extension | Purpose |
| --- | --- |
| HashiCorp Terraform | Syntax highlighting, autocompletion for .tf files |
| Docker | Manage containers and images directly from VS Code |
| Kubernetes | Interact with clusters, view pods and logs |
| YAML | Linting and schema validation for Kubernetes manifests, CI/CD configs |
| GitLens | Enhanced Git history, blame annotations, and branch visualisation |
| Prettier | Code formatting for JavaScript, JSON, HTML, CSS |
| Remote – SSH | Develop on remote Linux servers as if they were local |

The Broader Toolchain

VS Code is just the editor. A complete DevOps development environment also includes:

  • Git – version control (non-negotiable for every project)
  • A terminal – WSL2 on Windows, or the built-in terminal on macOS/Linux
  • Node.js / Python – scripting and automation
  • Docker Desktop – container runtime for local development
  • A cloud CLI – AWS CLI, Azure CLI, or gcloud, depending on your target platform

Getting comfortable with these tools before working on live infrastructure is essential. Mistakes in a local environment are free. Mistakes in production are expensive.

Reflection: Week 0 in Honest Review

What I Found Easy

The networking and DNS sections came naturally to me. These concepts map closely to everyday experiences – browsing websites, using email, navigating apps – so the mental models were already partially in place. I found that once you have the right analogy (packets as parcels, DNS as a phonebook, HTTPS as a locked envelope), the technical details click into place quickly.

What Was Difficult

Application architecture – specifically the distinction between two-tier and three-tier designs – required more effort than I anticipated. The concepts sound simple in isolation, but understanding the implications of each architectural decision takes deeper thinking. Why does moving business logic from the client to a dedicated application server change everything about scalability, security, and maintainability? The answer requires holding multiple concerns in mind simultaneously.

I also found that the most challenging part was not understanding what the layers are, but understanding why the separation exists and what goes wrong when it is violated. Reading about real-world examples – monolithic applications that became impossible to scale, data breaches caused by direct client-to-database access – made the architectural principles feel concrete rather than academic.

What I Will Improve Next Week

Hands-on practice with real tools. Reading and writing about networking is valuable, but there is a qualitative difference between understanding how DNS works conceptually and actually configuring a DNS zone, watching propagation happen, and debugging a misconfigured record. My goal for Week 1 is to close the gap between theoretical knowledge and practical muscle memory.

Specifically, I plan to:

  • Practice Linux command-line navigation and file management
  • Work through basic shell scripting exercises
  • Explore cloud console interfaces (starting with AWS)
  • Revisit application architecture by building a minimal three-tier app locally

Key Takeaways

If you have read this far, here is a summary of the most important concepts from Week 0:

  1. Networking protocols are layered. No single protocol handles everything. Understanding the layers prevents tunnel vision when debugging.
  2. Packet switching is what makes the internet resilient. Data takes multiple paths; failures are routed around automatically.
  3. HTTPS is not just about the padlock. It provides encryption, authentication, and integrity – three distinct security guarantees.
  4. Three-tier architecture is the baseline for modern applications. Separation of concerns enables independent scaling, improved security, and team autonomy.
  5. DNS is the phonebook of the internet, and A records map domain names to IPv4 addresses. TTL controls how long these mappings are cached globally.
  6. Your development environment is infrastructure. Set it up thoughtfully, version-control your configurations, and keep it consistent.

If you are following along or if you are on a similar DevOps learning path, feel free to connect in the comments. I would love to hear what foundational concepts you found most challenging – or which ones surprised you the most.

This post is part of my public learning journey through the DevOps Micro Internship (DMI) – Cohort 3 by Pravin Mishra at CloudAdvisory. All tasks completed in this programme are documented openly on this blog.

About DevOps Micro Internship (DMI) & CloudAdvisory
DevOps Micro Internship (DMI) is a free, project-based DevOps learning program by Pravin Mishra (CloudAdvisory). It helps students, job-seekers, and working professionals gain real-world DevOps skills through weekly assignments, projects, and community support.

🌐 DMI Official Website: https://pravinmishra.com/dmi

🎓 DevOps for Beginners: Docker, K8s, Cloud, CI/CD & 4 Projects (Udemy): https://www.udemy.com/course/devops-for-beginners-docker-k8s-cloud-cicd-4-projects/?referralCode=C5BA8236CCE9FE004F98

▶️ DevOps for Beginners – YouTube Playlist: https://www.youtube.com/playlist?list=PLVOdqXbCs7bX88JeUZmK4fKTq2hJ5VS89

🔗 Follow Pravin Mishra on LinkedIn: https://www.linkedin.com/in/pravin-mishra-aws-trainer/

Gemini Model Management: Ending Inefficiency! The Secret to 3x Faster Cost Tracking with Model Registry

Gemini Model Management: Ending Inefficiency – How Model Registry Tripled Our Cost Tracking Speed

Managing our Gemini model system had become a real headache. Model versioning was a mess, and tracking costs for each AI task was incredibly inefficient. I knew something had to change, so I started looking for ways to improve.

Trials and Tribulations

My first thought was to establish a Single Source of Truth. That led me to consider adopting a Model Registry. The idea was to manage all model metadata, version information, and experiment results in one place.

But it wasn’t as straightforward as I’d hoped. Initially, I just focused on storing model information. However, we soon realized a critical need to track costs per AI task and per tier. Trying to shoehorn this cost-tracking functionality into the Model Registry meant messing with the existing structure, which introduced unexpected complexity.

# Initial Model Registry Setup (Conceptual Example)
from google.cloud import aiplatform

aiplatform.init(project='my-gcp-project', location='us-central1')

model = aiplatform.Model.upload(
    display_name='gemini-model-v1',
    artifact_uri='gs://my-bucket/gemini-v1',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/gemini-gpu:20240101'
)

We uploaded models like this, but adding cost-related metadata just didn’t feel right. I wasn’t sure what attributes to use for cost information or how to query it. After hours of struggling, I realized that simply storing model information wasn’t enough.

The Root Cause

Ultimately, the problem wasn’t a lack of functionality in the Model Registry itself, but rather the absence of a clear data schema and an automated logging mechanism for cost tracking. We didn’t have a system to collect and record information in real-time about which model was used for each AI task and which tier it ran on. The Model Registry was great for managing the models themselves, but it didn’t automatically capture the cost context of how those models were being used.
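One way to picture the "clear data schema" that was missing is a small record type with an aggregation on top. This is a sketch under assumptions, not the schema we actually deployed and not a Vertex AI API: `CostRecord`, `total_by_model_and_tier`, and every value in the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CostRecord:
    job_id: str
    model_version: str
    tier: str
    cost_usd: Decimal  # Decimal avoids float rounding when summing money


def total_by_model_and_tier(records: list[CostRecord]) -> dict[tuple[str, str], Decimal]:
    """Sum cost per (model version, tier) pair."""
    totals: dict[tuple[str, str], Decimal] = defaultdict(Decimal)  # Decimal() is 0
    for record in records:
        totals[(record.model_version, record.tier)] += record.cost_usd
    return dict(totals)


# Illustrative data only.
records = [
    CostRecord("job-001", "gemini-model-v1", "premium", Decimal("12.50")),
    CostRecord("job-002", "gemini-model-v1", "premium", Decimal("7.25")),
    CostRecord("job-003", "gemini-model-v1", "standard", Decimal("3.00")),
]
for key, total in total_by_model_and_tier(records).items():
    print(key, total)
```

Once every task run emits one such record, "which tier is burning money" becomes a group-by instead of an investigation.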

The Solution

To tackle this, I implemented several changes concurrently:

  1. Extended Model Registry Schema for Cost Metadata: Added custom properties to store AI task IDs, tier information, and estimated costs.
  2. Automated Cost Logging During AI Task Execution: Modified the pipeline to calculate and log the estimated cost of each AI task to the Model Registry at the start and end of its execution, along with model information.
  3. Added Policy-Based Automated Validation: Incorporated logic to automatically verify if registered models meet specific cost thresholds or required metadata.
  4. Improved Intent Injection and Decision Logging for Weekly Reports: Ensured that when generating reports, we clearly documented the criteria used for cost aggregation and analysis, as well as the decisions made.

# Adding Cost Information to Model Registry (Improved Example)
from google.cloud import aiplatform

aiplatform.init(project='my-gcp-project', location='us-central1')

# Upload the model with the AI job ID and tier attached as labels.
# Label values may only contain lowercase letters, digits, underscores,
# and dashes, so a value like '50.00' is rejected; whole dollars are used here.
aiplatform.Model.upload(
    display_name='gemini-model-v1.1',
    artifact_uri='gs://my-bucket/gemini-v1.1',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/gemini-gpu:20240101',
    labels={
        'ai_job_id': 'job-abc-123',
        'tier': 'premium',
        'estimated_cost_usd': '50'
    }
)

# Tracking cost for a specific AI job (conceptual)
def log_ai_job_cost(job_id, model_name, tier, actual_cost):
    # Placeholder: this only prints. A real implementation would update the
    # model's metadata in the Model Registry or write to a separate cost DB.
    print(f"Logging cost for Job {job_id}: Model {model_name}, Tier {tier}, Cost ${actual_cost}")

log_ai_job_cost('job-abc-123', 'gemini-model-v1.1', 'premium', 55.75)

With these changes, we can now clearly track which AI tasks used which model version, which tier they ran on, and how much they cost.
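The policy-based validation from item 3 could look roughly like the sketch below. It assumes the three label keys used in the example above and a cost threshold passed in by the caller; `REQUIRED_LABELS` and `validate_labels` are hypothetical names, and nothing here calls the Vertex AI SDK.

```python
from decimal import Decimal, InvalidOperation

# Assumption: these are the labels every registered model must carry.
REQUIRED_LABELS = ("ai_job_id", "tier", "estimated_cost_usd")


def validate_labels(labels: dict[str, str], max_cost_usd: Decimal) -> list[str]:
    """Return a list of policy violations; an empty list means the labels pass."""
    problems = [f"missing label: {key}" for key in REQUIRED_LABELS if key not in labels]

    raw_cost = labels.get("estimated_cost_usd")
    if raw_cost is not None:
        try:
            cost = Decimal(raw_cost)
        except InvalidOperation:
            problems.append(f"estimated_cost_usd is not a number: {raw_cost!r}")
        else:
            if cost > max_cost_usd:
                problems.append(f"estimated cost {cost} exceeds threshold {max_cost_usd}")
    return problems


labels = {"ai_job_id": "job-abc-123", "tier": "premium", "estimated_cost_usd": "50"}
print(validate_labels(labels, max_cost_usd=Decimal("100")))  # passes: empty list
print(validate_labels(labels, max_cost_usd=Decimal("25")))   # one violation reported
```

Returning a list of problems rather than raising on the first one makes it easier to log every violation for a model in a single report line.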

Results

  • Established a Single Source of Truth: All Gemini model versions, metadata, and associated cost information are now centrally managed in the Model Registry.
  • Increased Cost Efficiency and Transparency: By enabling cost tracking per AI task and tier, we can quickly identify and optimize unnecessary spending. Cost tracking is now over 3x faster than before.
  • Automated and Improved Report Generation: The cost analysis and decision logging required for weekly reports are now automated, significantly reducing manual effort and increasing accuracy.

In Summary — To Avoid the Same Pitfalls

  • [ ] When adopting a Model Registry, plan ahead to design a schema that not only manages the model itself but also tracks cost information related to the model’s usage context (AI tasks, tiers, etc.).
  • [ ] It’s crucial to build a pipeline for automatically logging cost-related metadata during AI task execution.
  • [ ] Add policy-based automated validation to maintain data consistency and accuracy.
  • [ ] Cultivate the habit of clearly logging the decision-making process and its rationale when generating reports.

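Porting Sa-Token to the NestJS Ecosystem: Design Trade-offs in xlt-token 1.0

I recently released xlt-token@1.0.0-rc.1, a token authentication library designed for NestJS and inspired by Sa-Token from the Java ecosystem.

Repository: github.com/xiaoLangtou/xlt-token

The feature list does not look complicated (login, logout, kicking users out, permission checks, session storage), but once I started implementing it, every capability that "should obviously work this way" turned out to hide a few non-obvious choices. This post walks through several of them, mainly as a retrospective for myself, and hopefully as a useful reference for anyone designing something similar.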

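Why Not Just Use Passport?

@nestjs/passport is practically the default answer for authentication in NestJS, but at its core it is a strategy dispatcher: you supply the strategies (local / jwt / oauth2) and it dispatches to them. The problems it does not solve include:

  • When the same account logs in on a second device, should the first device be kicked out, or should both sessions coexist?
  • After a user is kicked offline, the frontend receives a 401. How does it tell "token expired" apart from "forced offline by an administrator"?
  • A user who has been active for 24 hours straight should not be logged out, but one who has been idle for 30 minutes should be logged out automatically. How do you support both expiry mechanisms at once?
  • Beyond loginId, you also want to store data such as the most recent IP and device ID, with the same lifetime as the token.

These are wheels the business side reinvents every time. Sa-Token packages them up in the Java ecosystem, and I wanted something similar for Node.

But porting is not translating. Java's synchronous blocking model, Spring's annotation scanning, and JVM reflection all have to be redesigned in TypeScript. The details below are the most typical trade-offs from that redesign.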

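The Three-Layer Key Structure

The most naive scheme is a one-to-one token -> userId mapping:

auth:token:abc123 → "1001"

But this cannot implement login replacement (a new login displacing the old one). At login you have the new userId=1001, but you do not know which token that user was using before. Finding it means scanning every key, which is unacceptable for performance.

xlt-token uses a three-layer key space:

authorization:login:token:<token>          → loginId
authorization:login:session:<loginId>      → token
authorization:login:lastActive:<token>     → timestamp

With the reverse index, the replacement logic at login is just two O(1) store operations, a get and an update: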

async login(loginId: string) {
  const oldToken = await store.get(sessionKey(loginId));
  if (oldToken && !isConcurrent) {
    await store.update(tokenKey(oldToken), 'BE_REPLACED');
  }
  const newToken = strategy.create();
  await store.set(tokenKey(newToken), loginId);
  await store.set(sessionKey(loginId), newToken);
  return newToken;
}

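After permissions and sessions were added, the key space grew by a few more entries, but the interface contract did not change: every key is a flat string KV pair, so the store can be backed by memory, Redis, or any other KV store without special handling.

Why Kicking a User Out Must Not Delete the Key

When a user is kicked offline, the intuitive move is to delete the tokenKey directly: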

async kickout(loginId) {
  const token = await store.get(sessionKey(loginId));
  await store.delete(tokenKey(token));
  await store.delete(sessionKey(loginId));
}

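The problem is that the next time the user sends a request with the old token, store.get(tokenKey) returns null, and you cannot tell "kicked out" from "token expired". The frontend only ever receives a generic 401.

xlt-token writes a sentinel value instead of deleting:

async kickout(loginId) {
  const token = await store.get(sessionKey(loginId));
  await store.update(tokenKey(token), 'KICK_OUT');  // keep the TTL, change only the value
  await store.delete(sessionKey(loginId));
}

When a request arrives, _resolveLoginId checks in order: token missing, value is null, value is BE_REPLACED, value is KICK_OUT, activity timeout, and finally success. As a result, the 401 response body the frontend receives can distinguish precisely among six not-logged-in reasons, and the client can handle each one differently ("your account logged in on another device" and "you have been forced offline" are two very different user experiences).

The sentinel's TTL follows the remaining lifetime of the original token, so it does not leak memory. The cost is one extra write when kicking a user out, but the diagnostic precision on the read path improves significantly.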
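The sentinel idea is independent of NestJS, so here is a language-neutral sketch of it in Python. It is not xlt-token's code: a plain dict stands in for the store, TTL handling and the activity-timeout check are left out, and the `Reason` enum, `resolve`, and `kickout` are hypothetical names. Treating "token missing" as "no token on the request" and "value is null" as "unknown or expired key" is my reading of the list above.

```python
from enum import Enum
from typing import Optional

BE_REPLACED = "BE_REPLACED"
KICK_OUT = "KICK_OUT"


class Reason(Enum):
    OK = "ok"
    NO_TOKEN = "no token on the request"
    INVALID = "token unknown or expired"
    BE_REPLACED = "logged in on another device"
    KICK_OUT = "forced offline"


def token_key(token: str) -> str:
    return f"authorization:login:token:{token}"


def session_key(login_id: str) -> str:
    return f"authorization:login:session:{login_id}"


def kickout(store: dict[str, str], login_id: str) -> None:
    # Overwrite the token's value with a sentinel instead of deleting the key.
    token = store.pop(session_key(login_id), None)
    if token is not None:
        store[token_key(token)] = KICK_OUT


def resolve(store: dict[str, str], token: Optional[str]) -> tuple[Reason, Optional[str]]:
    """Return (reason, login_id); login_id is set only when reason is OK."""
    if not token:
        return Reason.NO_TOKEN, None
    value = store.get(token_key(token))
    if value is None:
        return Reason.INVALID, None
    if value == BE_REPLACED:
        return Reason.BE_REPLACED, None
    if value == KICK_OUT:
        return Reason.KICK_OUT, None
    return Reason.OK, value


store = {token_key("abc123"): "1001", session_key("1001"): "abc123"}
print(resolve(store, "abc123"))  # logged in as 1001
kickout(store, "1001")
print(resolve(store, "abc123"))  # now reports KICK_OUT rather than a generic miss
```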

Wildcard Permission Matching: Two Bugs

In P1, when adding permission checks, user:* had to match user:add / user:edit. The first version looked like this:

function matchPermission(pattern: string, target: string): boolean {
  pattern.split('').forEach((char, i) => {
    if (char === '*') return true;
    if (char !== target[i]) return false;
  });
  return true;
}

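A return inside a forEach callback only ends that invocation of the callback, not the enclosing function, so every input returns true and the permission check is meaningless. I switched to a regular expression:

export function matchPermission(pattern: string, target: string): boolean {
  if (pattern === target) return true;
  if (pattern === '*') return true;

  const regex = new RegExp(
    '^' + pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*') + '$'
  );
  return regex.test(target);
}

The second bug was hidden deeper. The permission engine had this "short-circuit optimization":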

async hasPermission(loginId: string, perm: string) {
  const list = await this.stpInterface.getPermissionList(loginId);
  if (list.includes(perm)) return true;
  return list.some(p => matchPermission(p, perm));
}

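list.includes is a strict-equality match. If the user holds ['user:*'], then when checking 'user:add', includes returns false, and only then does execution reach the wildcard match inside some(...). That path is correct. But the short-circuit code itself hides a fact: includes is not a subset of matchPermission, and the two have different semantics. Once the business logic gets a little more complex (for example, exact and wildcard permissions side by side), this fast path can produce unexpected behavior, and it is hard to notice from tests because the two paths pass independently.

In the end I removed the short-circuit. The performance loss is under 5% (permission lists usually have 10 to 50 entries), but the logic is now linear and easy to reason about.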
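For readers who do not write TypeScript, the same escape-then-expand approach can be sketched in Python. This is an illustration of the technique, not xlt-token's code; `match_permission` and `has_permission` are hypothetical names, and there is deliberately a single matching path with no exact-match shortcut.

```python
import re


def match_permission(pattern: str, target: str) -> bool:
    # Escape every regex metacharacter first, then turn the escaped '*' back
    # into '.*' so that it is the only wildcard the pattern supports.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, target) is not None


def has_permission(granted: list[str], required: str) -> bool:
    # One path for exact and wildcard permissions alike.
    return any(match_permission(pattern, required) for pattern in granted)


print(has_permission(["user:*"], "user:add"))    # True
print(has_permission(["user:*"], "order:add"))   # False
print(match_permission("user.add", "userXadd"))  # False: '.' is matched literally
```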

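Dead Code in the Guard Abstract Class

After a NestJS Guard authenticates a request, it usually needs to load the user's information onto request.user. xlt-token provides an abstract base class for this: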

@Injectable()
export abstract class XltAbstractLoginGuard implements CanActivate {
  async canActivate(ctx: ExecutionContext): Promise<boolean> {
    if (!this.requiresLogin(ctx)) return true;

    const request = ctx.switchToHttp().getRequest();
    const result = await this.stpLogic.checkLogin(request);

    if (!result.ok) {
      await this.onAuthFail?.(result, request);
      throw new NotLoginException(result.reason);
    }

    request.stpLoginId = result.loginId;
    await this.onAuthSuccess?.(result, request);
    return true;
  }

  protected requiresLogin(ctx: ExecutionContext): boolean { /* default implementation */ }
  protected onAuthSuccess?(result, request): void | Promise<void>;
  protected onAuthFail?(result, request): void | Promise<void>;
}

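A business subclass only needs to implement onAuthSuccess to load the user's information. It looked clean: the unit tests were all green, and I opened the PR.

During E2E testing I found that onAuthFail had never been called. Looking back at the code, I saw why: stpLogic.checkLogin throws NotLoginException directly when validation fails, so the if (!result.ok) branch is dead code and the onAuthFail hook can never be reached.

The fix is to move the hook into a catch block: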

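async canActivate(ctx: ExecutionContext): Promise<boolean> {
  if (!this.requiresLogin(ctx)) return true;

  const request = ctx.switchToHttp().getRequest();
  let result;
  try {
    result = await this.stpLogic.checkLogin(request);
  } catch (e) {
    if (e instanceof NotLoginException) {
      // checkLogin throws on failure, so the failure hook has to run here
      await this.onAuthFail?.({ ok: false, reason: e.message }, request);
    }
    throw e;
  }

  request.stpLoginId = result.loginId;
  await this.onAuthSuccess?.(result, request);
  return true;
}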

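Unit tests cannot catch this bug, because they typically mock checkLogin to return a { ok: false } object instead of actually throwing. Only running the whole Guard inside a real Nest container in an E2E test exposes the fact that the hook was never triggered. After this I added full E2E test infrastructure to the project.

StpUtil Static Facade vs DI

The NestJS best practice is to route everything through DI, but DI is awkward in some situations: global exception filters, utility helpers, and tests that need to quickly mock global authentication state. Following Sa-Token, xlt-token also provides a static facade: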

import { StpUtil } from 'xlt-token';

const token = await StpUtil.login('1001');
const id = await StpUtil.getLoginId(req);

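The implementation is a lazy singleton: in its OnModuleInit hook, XltTokenModule injects the instance from the container into a static variable. The main differences between the two styles: DI is more testable and naturally supports multiple instances; the static facade is more convenient, but it is a global singleton and can only be called after module init.

Keeping both is deliberate: users can pick whichever fits the context. The underlying implementation is the same, so the behavior is identical.

Numbers

1.0.0-rc.1 is 7.4 KB gzipped after bundling, with 98%+ unit test coverage, 95%+ E2E coverage, and 195 test cases. The only dependencies are es-toolkit, uuid, and the NestJS peer dependency, with no hard binding to any ORM, database, or Redis.

What's Next

The scope of 1.0 is complete single-point login authentication. 1.1.0 plans to add: secondary authentication plus temporary tokens, multi-device login management (mutual kick-out by device type), a JWT Strategy (interchangeable with the current UUID strategy), and observability APIs such as an online user list.

Detailed roadmap: xiaolangtou.github.io/xlt-token/roadmap/1-1-0

pnpm add xlt-token@next

Docs: xiaolangtou.github.io/xlt-token

GitHub: github.com/xiaoLangtou/xlt-token

1.0.0-rc.1 is the last window before the stable release. Feedback on API naming, type signatures, and documentation is welcome, or just open an Issue.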

Stop Writing O(n²) Loops: Master the Two Pointers Pattern

A hands-on guide to one of the most powerful algorithmic patterns you’ll use every week — with real code in Java, Python, and C.

✍️ Written for developers who want to level up · ⏱ ~25 min read

The Loop That Was Silently Killing Your Code

Let me tell you a story I’m mildly embarrassed about. Early in my coding journey, I was given a classic problem: “Find two numbers in a sorted array that add up to a target.” Simple enough, right? I wrote a nested loop — two for loops, one inside the other — and it worked. All test cases passed. I submitted. Green checkmarks. I felt like a genius.

Then my friend looked at my code and said, “Bro, your solution is O(n²). On an array of a million elements, that’s a trillion operations.” I stared at him blankly. He then showed me a version that did the same thing in O(n) — one single pass — using something called the Two Pointers pattern. It used eight lines of code. Mine used eighteen. His was a hundred times faster at scale. I felt significantly less genius.

The Two Pointers pattern is one of those ideas that, once you see it, you can’t unsee it. You’ll start recognising problems that scream for it. You’ll look at nested loops and feel a twitch. It is not an exaggeration to say it will change how you think about array and string problems. Today, we’re going deep — the intuition, the mechanics, the code in Java, Python, and C, the classic problems, the pitfalls, and a hands-on challenge to cement it all. Let’s go.

What Is the Two Pointers Pattern?

Here is the best real-world analogy I’ve found. Imagine you’re playing a game where you have a row of numbered tiles on the floor. You want to find two tiles whose values add to 10. The dumb way? Start at tile 1, check it against every other tile. Then start at tile 2, check against every other tile. That’s the nested loop approach — exhaustive, boring, slow.

The smart way? You and your friend stand at opposite ends of the row. You call out the sum of your two tiles. If it’s too high, your friend (at the right end) steps inward. If it’s too low, you (at the left end) step inward. You both move toward each other based on the result, and you cover the entire search space in a single coordinated walk. That’s Two Pointers.

Two Pointers is a technique where you maintain two index variables — typically called left and right — and move them strategically through a data structure to solve a problem in a single linear pass, eliminating the need for a nested loop.

There are actually three main flavours of Two Pointers. Understanding which variant to use is half the battle:

Variant 1 — Opposite Ends (Converging Pointers)

Both pointers start at the two ends of the array and move toward each other. Used when the array is sorted and you’re looking for a pair with some property (sum, difference, etc.). This is the classic form.

Variant 2 — Same Direction (Sliding / Fast-Slow)

Both pointers start at the same end. One moves fast, the other moves slowly based on a condition. Used for removing duplicates, finding subarrays, or detecting cycles in linked lists.

Variant 3 — Two Arrays / Strings

One pointer in each of two separate arrays or strings, both moving forward. Used for merging, comparing, or intersecting sequences.

```plaintext
VARIANT 1: CONVERGING (Opposite Ends)

Array: [ 1, 3, 5, 7, 9, 11, 15 ]
         ↑                  ↑
        left              right

sum too small → move left →
sum too large → move right ←
converge until they meet

VARIANT 2: SAME DIRECTION (Fast + Slow)

Array: [ 1, 1, 2, 3, 3, 4, 5 ]
         ↑  ↑
      slow  fast

fast scans; slow marks the "write position"
slow advances only when fast finds a new unique value

VARIANT 3: TWO ARRAYS

A: [ 1, 3, 5, 7 ]    B: [ 2, 3, 6, 8 ]
     ↑                    ↑
     pA                   pB

advance the pointer pointing at the smaller value
```
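Variant 3 is the only flavour that does not get a full walkthrough in the problems below, so here is a minimal sketch of it: intersecting two sorted arrays by always advancing the pointer at the smaller value. The function name `intersect_sorted` is just a label for this example.

```python
def intersect_sorted(a: list[int], b: list[int]) -> list[int]:
    """Return the values common to two sorted lists in O(len(a) + len(b)) time."""
    i = j = 0
    common: list[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])  # match: record it and advance both pointers
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1               # a's value is smaller, so it cannot match later in b
        else:
            j += 1               # b's value is smaller, so it cannot match later in a
    return common


print(intersect_sorted([1, 3, 5, 7], [2, 3, 6, 8]))  # [3]
```

Each loop iteration advances at least one pointer, so the work is bounded by the combined length of the two inputs.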

How Does It Actually Work? The Core Mechanics

Let’s anchor the intuition with the most foundational Two Pointers problem: the Two Sum II problem on a sorted array. Given a sorted array and a target, find the indices of two numbers that add up to the target.

Here’s the step-by-step logic for the converging variant:

  1. Set left = 0 (beginning of array), right = n - 1 (end of array).
  2. Compute sum = array[left] + array[right].
  3. If sum == target → you found it. Return the indices.
  4. If sum < target → the sum is too small. To increase it, move left one step right (toward larger values).
  5. If sum > target → the sum is too large. To decrease it, move right one step left (toward smaller values).
  6. Repeat while left < right. If the pointers meet without finding the pair, no pair exists.

Why does this work? Because the array is sorted, we have predictable directionality. When we say “move left to the right,” we know for certain we’re increasing the sum. When we move right to the left, we’re decreasing it. Without sorting, this guarantee collapses and Two Pointers won’t work in this form.

💡 Pro Tip: The Two Pointers technique works because the data has a monotonic property: either sorted order or a directional constraint. Before reaching for Two Pointers, always ask: "Is there an ordering I can exploit?" If yes, you probably have a Two Pointers problem.

The time complexity drops from O(n²) (brute-force nested loops) to O(n), because every iteration moves one pointer inward, so the two pointers together make at most n - 1 moves before they meet. Space complexity is O(1): no extra data structure is needed, just two integer variables.

| Approach | Time Complexity | Space Complexity |
| --- | --- | --- |
| Brute Force (nested loops) | O(n²) | O(1) |
| Hash Map | O(n) | O(n) |
| Two Pointers (sorted) | O(n) | O(1) |

Two Pointers is often the sweet spot — matching the speed of hash maps with the space efficiency of brute force.

Code Walkthroughs — Java, Python, and C

Let’s implement the core Two Pointers problems in all three languages. We’ll cover three problems progressively: Two Sum II, Remove Duplicates from Sorted Array, and Container With Most Water. Each one teaches a different dimension of the pattern.

Problem 1 — Two Sum II (Converging Pointers)

Given a sorted array of integers and a target, return the 1-based indices of the two numbers that add up to the target.

Python

```python
def two_sum(numbers: list[int], target: int) -> list[int]:
    left = 0                    # start at the beginning
    right = len(numbers) - 1    # start at the end

    while left < right:         # loop until pointers meet
        current_sum = numbers[left] + numbers[right]

        if current_sum == target:
            # found! return 1-based indices as required
            return [left + 1, right + 1]

        elif current_sum < target:
            # sum is too small → need a larger left value
            left += 1

        else:
            # sum is too large → need a smaller right value
            right -= 1

    return []   # no pair found (problem guarantees one exists, but good practice)


# ── Test ──

numbers = [2, 7, 11, 15]
target = 9
print(two_sum(numbers, target))    # Output: [1, 2]

numbers2 = [1, 3, 4, 5, 7, 10, 11]
target2 = 9
print(two_sum(numbers2, target2))  # Output: [3, 4]

# Trace for numbers2:
# left=0(1), right=6(11) → 12 > 9 → right--
# left=0(1), right=5(10) → 11 > 9 → right--
# left=0(1), right=4(7)  → 8 < 9  → left++
# left=1(3), right=4(7)  → 10 > 9 → right--
# left=1(3), right=3(5)  → 8 < 9  → left++
# left=2(4), right=3(5)  → 9 == 9 → return [3, 4] ✓
```

Java

```java
public class TwoSum {

    public static int[] twoSum(int[] numbers, int target) {
        int left = 0;                    // pointer at the start
        int right = numbers.length - 1;  // pointer at the end

        while (left < right) {
            int sum = numbers[left] + numbers[right];

            if (sum == target) {
                // return 1-based indices (problem requirement)
                return new int[]{left + 1, right + 1};
            } else if (sum < target) {
                left++;   // need bigger left value → move right
            } else {
                right--;  // need smaller right value → move left
            }
        }
        return new int[]{};  // no solution found
    }

    public static void main(String[] args) {
        int[] numbers = {2, 7, 11, 15};
        int[] result = twoSum(numbers, 9);
        System.out.println(result[0] + ", " + result[1]); // 1, 2

        int[] numbers2 = {1, 2, 3, 4, 4, 9, 56, 90};
        int[] result2 = twoSum(numbers2, 8);
        System.out.println(result2[0] + ", " + result2[1]); // 4, 5  (4+4=8)
    }
}
```

C

```c
#include <stdio.h>

/* two_sum: finds two indices (1-based) whose values sum to target.
   Writes result into out[0] and out[1]. Returns 1 on success, 0 on failure. */
int two_sum(int *numbers, int n, int target, int *out) {
    int left = 0;        /* start pointer */
    int right = n - 1;   /* end pointer */

    while (left < right) {
        int sum = numbers[left] + numbers[right];

        if (sum == target) {
            out[0] = left + 1;   /* convert to 1-based */
            out[1] = right + 1;
            return 1;            /* success */
        } else if (sum < target) {
            left++;              /* too small → move left pointer right */
        } else {
            right--;             /* too large → move right pointer left */
        }
    }
    return 0;   /* not found */
}

int main(void) {
    int numbers[] = {2, 7, 11, 15};
    int n = sizeof(numbers) / sizeof(numbers[0]); /* compute length */
    int out[2];

    if (two_sum(numbers, n, 9, out)) {
        printf("Indices: %d, %d\n", out[0], out[1]); /* Indices: 1, 2 */
    }
    return 0;
}
```

Problem 2 — Remove Duplicates from Sorted Array (Same Direction)

Given a sorted array, remove duplicates in-place. Return the length of the new array. This is the fast/slow pointer variant — one pointer writes, one pointer reads.

Python

```python
def remove_duplicates(nums: list[int]) -> int:
    if not nums:
        return 0

    # 'slow' marks the last position we wrote a unique value to
    slow = 0

    # 'fast' scans every element ahead
    for fast in range(1, len(nums)):
        if nums[fast] != nums[slow]:
            # found a new unique value: move slow forward and write it
            slow += 1
            nums[slow] = nums[fast]

    # slow is now the index of the last unique element,
    # so the count of unique elements is slow + 1
    return slow + 1


# ── Test ──

nums = [1, 1, 2, 2, 3, 4, 4, 5]
k = remove_duplicates(nums)
print(k)         # 5
print(nums[:k])  # [1, 2, 3, 4, 5]

# Trace:
# fast=1: nums[1]=1 == nums[0]=1 → skip
# fast=2: nums[2]=2 != nums[0]=1 → slow=1, nums[1]=2
# fast=3: nums[3]=2 == nums[1]=2 → skip
# fast=4: nums[4]=3 != nums[1]=2 → slow=2, nums[2]=3
# fast=5: nums[5]=4 != nums[2]=3 → slow=3, nums[3]=4
# fast=6: nums[6]=4 == nums[3]=4 → skip
# fast=7: nums[7]=5 != nums[3]=4 → slow=4, nums[4]=5
# return slow+1 = 5 ✓
```

Java

```java
public class RemoveDuplicates {

    public static int removeDuplicates(int[] nums) {
        if (nums.length == 0) return 0;

        int slow = 0;  // "write" pointer: tracks last unique element position

        for (int fast = 1; fast < nums.length; fast++) {
            // 'fast' is the "read" pointer: scans everything
            if (nums[fast] != nums[slow]) {
                slow++;                   // advance write head
                nums[slow] = nums[fast];  // write the new unique value
            }
            // if equal, fast just keeps moving and slow stays put
        }

        return slow + 1;  // number of unique elements
    }

    public static void main(String[] args) {
        int[] nums = {0, 0, 1, 1, 1, 2, 2, 3, 3, 4};
        int k = removeDuplicates(nums);
        System.out.println("Unique count: " + k);  // 5
        // First k elements of nums are now: [0, 1, 2, 3, 4]
    }
}
```

C

```c
#include <stdio.h>

/* remove_duplicates: modifies array in place.
   Returns the count of unique elements. */
int remove_duplicates(int *nums, int n) {
    if (n == 0) return 0;

    int slow = 0;   /* write pointer: last position of a confirmed unique value */

    for (int fast = 1; fast < n; fast++) {
        /* fast is the read pointer: advances every iteration */
        if (nums[fast] != nums[slow]) {
            slow++;
            nums[slow] = nums[fast];  /* overwrite with the new unique value */
        }
    }

    return slow + 1;  /* total unique elements */
}

int main(void) {
    int nums[] = {1, 1, 2, 3, 3, 4, 4, 5};
    int n = sizeof(nums) / sizeof(nums[0]);
    int k = remove_duplicates(nums, n);

    printf("Unique count: %d\n", k);    /* 5 */
    printf("Array: ");
    for (int i = 0; i < k; i++) {
        printf("%d ", nums[i]);         /* 1 2 3 4 5 */
    }
    printf("\n");
    return 0;
}
```

Problem 3 — Container With Most Water (Converging + Greedy Decision)

Given an array where each element represents a vertical line’s height, find two lines that together with the x-axis form a container holding the most water. This is a brilliant problem because the Two Pointers move isn’t arbitrary — it’s driven by a greedy insight.

🔍 Fun Fact: The greedy insight: The area of water is constrained by the shorter line. If we move the pointer at the taller line inward, we can only shrink the width — so the area can only stay the same or decrease. But if we move the shorter line, there’s a chance of finding a taller line that improves the area. So we always move the shorter pointer.

Python

```python
def max_area(height: list[int]) -> int:
    left = 0
    right = len(height) - 1
    max_water = 0

    while left < right:
        # width = distance between the two lines
        width = right - left

        # height is limited by the shorter of the two lines
        current_height = min(height[left], height[right])

        # compute and update max area
        max_water = max(max_water, width * current_height)

        # greedy move: discard the shorter line by moving its pointer inward
        # rationale: keeping the shorter line can never improve the area
        if height[left] < height[right]:
            left += 1
        else:
            right -= 1

    return max_water


# ── Test ──

print(max_area([1, 8, 6, 2, 5, 4, 8, 3, 7]))  # 49

# The best container is between index 1 (height=8) and index 8 (height=7)
# width = 8-1 = 7, height = min(8,7) = 7, area = 49
```

Java

```java
public class MaxWater {

    public static int maxArea(int[] height) {
        int left = 0;
        int right = height.length - 1;
        int maxWater = 0;

        while (left < right) {
            int width = right - left;
            int h     = Math.min(height[left], height[right]);
            int area  = width * h;

            maxWater = Math.max(maxWater, area);

            // move the shorter wall's pointer inward
            if (height[left] < height[right]) {
                left++;
            } else {
                right--;
            }
        }

        return maxWater;
    }

    public static void main(String[] args) {
        int[] height = {1, 8, 6, 2, 5, 4, 8, 3, 7};
        System.out.println(maxArea(height));  // 49
    }
}
```

C

```c
#include <stdio.h>

/* Helper macros */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

int max_area(int *height, int n) {
    int left = 0;
    int right = n - 1;
    int max_water = 0;

    while (left < right) {
        int width = right - left;
        int h     = MIN(height[left], height[right]);
        int area  = width * h;

        max_water = MAX(max_water, area);

        /* always move the shorter wall inward */
        if (height[left] < height[right]) {
            left++;
        } else {
            right--;
        }
    }
    return max_water;
}

int main(void) {
    int height[] = {1, 8, 6, 2, 5, 4, 8, 3, 7};
    int n = sizeof(height) / sizeof(height[0]);
    printf("Max water: %d\n", max_area(height, n)); /* 49 */
    return 0;
}
```

Common Pitfalls (I’ve Hit Every Single One)

Pitfall 1 — Using Two Pointers on an Unsorted Array

The converging pointer technique only works on sorted arrays for sum/pair problems. When I first learned Two Pointers, I tried applying it to an unsorted array and got wrong answers I couldn’t debug for hours. The core logic — “move left to increase the sum, move right to decrease it” — relies entirely on sorted order. If the array isn’t sorted, sort it first (paying the O(n log n) cost), or switch to a hash map approach.

⚠️ Watch Out! Never assume the array is sorted just because it “looks” sorted in your test case. Always check or sort explicitly. One test case with a random array will expose the bug immediately.
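To make this failure concrete, here is a minimal sketch. The function names `pair_sum_converging` and `pair_sum_hash` are illustrative and not from earlier in the article. It runs the converging scan on an unsorted list, where it misses a pair that exists, and then shows the hash map fallback that does not depend on order.

```python
from typing import Optional


def pair_sum_converging(nums: list[int], target: int) -> Optional[tuple[int, int]]:
    """Converging scan; only correct when nums is sorted ascending."""
    left, right = 0, len(nums) - 1
    while left < right:
        total = nums[left] + nums[right]
        if total == target:
            return left, right
        if total < target:
            left += 1
        else:
            right -= 1
    return None


def pair_sum_hash(nums: list[int], target: int) -> Optional[tuple[int, int]]:
    """Hash map lookup; works on any order, O(n) time and O(n) extra space."""
    seen: dict[int, int] = {}  # value -> index where it was first seen
    for i, value in enumerate(nums):
        if target - value in seen:
            return seen[target - value], i
        seen[value] = i
    return None


unsorted = [4, 1, 9, 3]
print(pair_sum_converging(unsorted, 10))          # None: misses 1 + 9
print(pair_sum_hash(unsorted, 10))                # (1, 2)
print(pair_sum_converging(sorted(unsorted), 10))  # (0, 3), indices in the sorted copy
```

Note that sorting changes the indices, so if the problem asks for original positions, the hash map version is usually the simpler choice.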

Pitfall 2 — Off-by-One Errors with the Loop Condition

The loop condition while (left < right) vs while (left <= right) matters enormously. Using <= can cause you to use the same element twice (pairing an element with itself). For all two-element pair problems, use strict <. I’ve cost myself points in contests over this exact character.
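A small sketch of the difference, using a hypothetical `find_pair` helper with a `strict` flag added purely for demonstration. With `<=`, the pointers are allowed to meet, and the element they meet on gets added to itself.

```python
from typing import Optional


def find_pair(nums: list[int], target: int, strict: bool = True) -> Optional[tuple[int, int]]:
    """Converging pair search on a sorted list.

    strict=True loops while left < right (correct).
    strict=False loops while left <= right (reproduces the off-by-one bug).
    """
    left, right = 0, len(nums) - 1
    while (left < right) if strict else (left <= right):
        total = nums[left] + nums[right]
        if total == target:
            return left, right
        if total < target:
            left += 1
        else:
            right -= 1
    return None


nums = [1, 3, 4]
print(find_pair(nums, 6, strict=True))   # None: no two distinct elements sum to 6
print(find_pair(nums, 6, strict=False))  # (1, 1): the 3 was paired with itself
```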

Pitfall 3 — Moving Both Pointers When You Should Move One

On a mismatch, move exactly one pointer: left++ when the sum is too small, right-- when it is too large. Moving both at once skips candidate pairs. When your sum equals the target, you’ve found your answer, so return it and stop. The exception is problems that need every match (finding all pairs, or 3Sum): after recording a valid pair, move both pointers with left++; right--; because both values are used up. Moving only one there can record the same pair again when the array contains repeated values.
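As a sketch of the "find all pairs" case, the hypothetical `all_pairs` function below moves one pointer on a mismatch and both pointers after a match, skipping repeated values first so each pair is reported once. It assumes the input is already sorted.

```python
def all_pairs(nums: list[int], target: int) -> list[tuple[int, int]]:
    """Return every unique value pair in sorted `nums` that sums to `target`."""
    pairs = []
    left, right = 0, len(nums) - 1
    while left < right:
        total = nums[left] + nums[right]
        if total == target:
            pairs.append((nums[left], nums[right]))
            # Step past repeated values so the same pair is not recorded twice.
            while left < right and nums[left] == nums[left + 1]:
                left += 1
            while left < right and nums[right] == nums[right - 1]:
                right -= 1
            # Both values are used up: move both pointers.
            left += 1
            right -= 1
        elif total < target:
            left += 1   # mismatch: move only the left pointer
        else:
            right -= 1  # mismatch: move only the right pointer
    return pairs


print(all_pairs([1, 2, 2, 3, 4, 4, 5], 6))  # [(1, 5), (2, 4)]
```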

Pitfall 4 — Forgetting to Handle the Empty Array or Single Element

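In C especially, accessing array[-1] or going out of bounds is undefined behaviour: it can corrupt memory silently and crash in the most confusing ways. Always check if (n == 0 || n == 1) return ... before initialising pointers. Java throws an ArrayIndexOutOfBoundsException. Python raises an IndexError on an empty list but silently wraps negative indices (array[-1] is the last element), which can hide the bug. C just detonates quietly.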

Pitfall 5 — Applying the Converging Pattern to the Wrong Problem Shape

Two Pointers is not always the right tool. If the problem involves a subarray or substring with no clear sorted property but a size constraint, you likely want a sliding window instead. Sliding window is Two Pointers’ cousin — one pointer marks the start of a window, the other expands or contracts it. Two Pointers without sorting often becomes a sliding window problem in disguise.

💡 Pro Tip — Quick decision tree:

  • Sorted array + pair sum? → Converging Two Pointers
  • Subarray with running condition? → Sliding Window (fast/slow variant)
  • Merging two sorted arrays? → Two-array Two Pointers
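The third branch of the decision tree, one pointer per sequence, can be sketched as follows. The function name `merge_sorted` is illustrative, and both inputs are assumed to be sorted ascending.

```python
def merge_sorted(a: list[int], b: list[int]) -> list[int]:
    """Merge two ascending lists with one pointer per list, in O(len(a) + len(b))."""
    i = j = 0
    merged = []
    while i < len(a) and j < len(b):
        # Take the smaller front element and advance only that list's pointer.
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge_sorted([1, 4, 7], [2, 3, 9, 10]))  # [1, 2, 3, 4, 7, 9, 10]
```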

Hands-On Challenge — Three Problems to Cement It

Reading about patterns is good. Writing them from scratch is what actually wires the pattern into your brain. Here are three problems in increasing difficulty. Try each one before looking at the solution.

Challenge 1 — Valid Palindrome (Easy)

Given a string, check if it is a palindrome considering only alphanumeric characters and ignoring case.

Input: "A man, a plan, a canal: Panama"

Output: true

Hint: Put one pointer at the start, one at the end. Skip non-alphanumeric characters. Compare characters (case-insensitive). If they ever differ, return false. If the pointers cross, return true.

Python — Solution

```python
def is_palindrome(s: str) -> bool:
    left, right = 0, len(s) - 1

    while left < right:
        # skip non-alphanumeric from left
        while left < right and not s[left].isalnum():
            left += 1
        # skip non-alphanumeric from right
        while left < right and not s[right].isalnum():
            right -= 1

        # case-insensitive character comparison
        if s[left].lower() != s[right].lower():
            return False

        left += 1
        right -= 1

    return True

print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("race a car"))  # False
```

Challenge 2 — 3Sum (Medium)

Given an array of integers, find all unique triplets that sum to zero.

Input: [-1, 0, 1, 2, -1, -4]

Output: [[-1, -1, 2], [-1, 0, 1]]

Hint: Sort the array. Fix one element at index i using an outer loop. Then run Two Pointers (left = i+1, right = n-1) to find pairs that sum to -nums[i]. After finding a valid triplet, skip duplicates by moving pointers past equal values.

Python — Solution

```python
def three_sum(nums: list[int]) -> list[list[int]]:
    nums.sort()  # crucial: sort first
    result = []

    for i in range(len(nums) - 2):
        # skip duplicate values for the fixed element
        if i > 0 and nums[i] == nums[i - 1]:
            continue

        left  = i + 1
        right = len(nums) - 1

        while left < right:
            total = nums[i] + nums[left] + nums[right]

            if total == 0:
                result.append([nums[i], nums[left], nums[right]])

                # skip duplicates on both sides
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1

                # move both pointers after recording the triplet
                left += 1
                right -= 1

            elif total < 0:
                left += 1   # need bigger value
            else:
                right -= 1  # need smaller value

    return result

print(three_sum([-1, 0, 1, 2, -1, -4]))  # [[-1, -1, 2], [-1, 0, 1]]
```

Challenge 3 — Trapping Rain Water (Hard)

Given an array where each element is the height of a bar, compute how much rainwater can be trapped between bars.

Input: [0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]

Output: 6

Hint: Use Two Pointers with two running max values: left_max and right_max. At each step, process the side with the smaller max. Water at a position = max_height_on_that_side − current_height. This avoids needing prefix/suffix max arrays.

Python — Solution

```python
def trap(height: list[int]) -> int:
    left, right = 0, len(height) - 1
    left_max = right_max = 0
    water = 0

    while left < right:
        if height[left] < height[right]:
            # process left side (right side is guaranteed taller)
            if height[left] >= left_max:
                left_max = height[left]   # update max, no water trapped here
            else:
                water += left_max - height[left]   # water fills the gap
            left += 1
        else:
            # process right side (left side is at least as tall)
            if height[right] >= right_max:
                right_max = height[right]
            else:
                water += right_max - height[right]
            right -= 1

    return water

print(trap([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))  # 6
```
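For comparison, here is a sketch of the prefix/suffix max approach that the hint says the Two Pointers version avoids. The name `trap_prefix_suffix` is illustrative. It spends O(n) extra space on two arrays, which makes the rule "water at i = min(tallest bar to the left, tallest bar to the right) minus height[i]" explicit, and it is handy as a cross-check while debugging the O(1)-space version.

```python
def trap_prefix_suffix(height: list[int]) -> int:
    """Trapped water using explicit prefix and suffix maxima (O(n) extra space)."""
    n = len(height)
    if n == 0:
        return 0

    # left_max[i] = tallest bar in height[0..i]
    left_max = [0] * n
    left_max[0] = height[0]
    for i in range(1, n):
        left_max[i] = max(left_max[i - 1], height[i])

    # right_max[i] = tallest bar in height[i..n-1]
    right_max = [0] * n
    right_max[n - 1] = height[n - 1]
    for i in range(n - 2, -1, -1):
        right_max[i] = max(right_max[i + 1], height[i])

    # The water level above bar i is capped by the lower of the two maxima.
    return sum(min(left_max[i], right_max[i]) - height[i] for i in range(n))


print(trap_prefix_suffix([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))  # 6
```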

💡 Pro Tip: Trapping Rain Water is a classic interview question at top-tier companies. If you can implement this cleanly in an interview, you immediately signal strong algorithmic thinking. Practise explaining the why behind each pointer movement — that’s what interviewers care about.

⚡ Recap — Key Takeaways

  • Two Pointers replaces an inner loop with a single linear scan: many O(n²) pair problems drop to O(n) time and O(1) extra space (plus O(n log n) if you have to sort first).
  • Three variants: converging (opposite ends), fast/slow (same direction), two-array (one pointer per sequence).
  • Sorted order is the prerequisite for converging pointers — always sort first if needed.
  • Move the pointer that can improve your answer — in sum problems, move toward more favourable values; in max-area problems, discard the shorter wall.
  • Loop condition is left < right (strict) to avoid pairing an element with itself.
  • Fast/slow pointers are perfect for in-place array modifications — slow writes, fast reads.
  • 3Sum = outer loop + Two Pointers — this generalises to k-Sum problems.
  • Trapping Rain Water is the hardest Two Pointers application — master it and you’ve mastered the pattern.
  • Know when NOT to use it — unsorted + no ordering property → use hash map; running window condition → use sliding window.
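The "slow writes, fast reads" takeaway can be sketched with in-place duplicate removal on a sorted list. The function name `remove_duplicates` is illustrative; it returns the length of the deduplicated prefix, as the LeetCode version of the problem does.

```python
def remove_duplicates(nums: list[int]) -> int:
    """Compact a sorted list in place; return the count of unique values."""
    if not nums:
        return 0
    slow = 0  # index of the last unique value written so far
    for fast in range(1, len(nums)):  # fast scans every element once
        if nums[fast] != nums[slow]:
            slow += 1
            nums[slow] = nums[fast]  # slow writes, fast reads
    return slow + 1


data = [1, 1, 2, 2, 2, 3, 5, 5]
k = remove_duplicates(data)
print(k, data[:k])  # 4 [1, 2, 3, 5]
```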

What’s Next? — Your Learning Path

Two Pointers is one node in a larger graph of algorithmic patterns. Here’s where to go from here:

Immediate Next: Sliding Window

Sliding Window is the fast/slow variant taken to its full form. Problems: Longest Substring Without Repeating Characters, Minimum Size Subarray Sum, Longest Repeating Character Replacement. Once you know Two Pointers, Sliding Window is about a day’s work to learn.
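As a taste of the pattern, here is a minimal sketch of Minimum Size Subarray Sum. It assumes all numbers are positive, which is what lets the window shrink safely; the function name `min_subarray_len` is illustrative.

```python
def min_subarray_len(target: int, nums: list[int]) -> int:
    """Length of the shortest contiguous subarray with sum >= target, or 0 if none.

    Assumes every value in nums is positive.
    """
    best = len(nums) + 1  # sentinel: longer than any real window
    window_sum = 0
    start = 0
    for end, value in enumerate(nums):
        window_sum += value  # expand the window to the right
        while window_sum >= target:
            best = min(best, end - start + 1)
            window_sum -= nums[start]  # contract from the left
            start += 1
    return 0 if best == len(nums) + 1 else best


print(min_subarray_len(7, [2, 3, 1, 2, 4, 3]))  # 2, the window [4, 3]
```

Both pointers only ever move forward, so the whole scan is O(n) even though there is a loop inside a loop.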

Then: Binary Search

Binary Search and Two Pointers are siblings — both exploit sorted order. Many problems can be solved with either, and knowing both lets you pick the more elegant solution. Practise on: Search in Rotated Sorted Array, Find Minimum in Rotated Sorted Array, Koko Eating Bananas.

Levelling Up: Linked List Two Pointers

The fast/slow pointer idea applies beautifully to linked lists — Floyd’s Cycle Detection algorithm uses it to find cycles in O(1) space. Problems: Linked List Cycle, Find Middle of Linked List, Happy Number.
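Floyd's idea does not need an actual linked list; any "next value" function works. The sketch below applies it to Happy Number, with illustrative function names: slow advances one step, fast advances two, and they either reach 1 or meet inside a cycle.

```python
def digit_square_sum(n: int) -> int:
    """Sum of the squares of the decimal digits of n."""
    total = 0
    while n > 0:
        n, digit = divmod(n, 10)
        total += digit * digit
    return total


def is_happy(n: int) -> bool:
    """Floyd's cycle detection over the sequence n -> digit_square_sum(n)."""
    slow = n
    fast = digit_square_sum(n)
    while fast != 1 and slow != fast:
        slow = digit_square_sum(slow)                    # one step
        fast = digit_square_sum(digit_square_sum(fast))  # two steps
    return fast == 1


print(is_happy(19))  # True: 19 -> 82 -> 68 -> 100 -> 1
print(is_happy(2))   # False: falls into the cycle containing 4
```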

Problem Bank to Grind

  • Easy: Valid Palindrome, Move Zeroes, Squares of Sorted Array, Two Sum II
  • Medium: 3Sum, Container With Most Water, Remove Duplicates, Sort Colors (Dutch Flag)
  • Hard: Trapping Rain Water, 4Sum, Minimum Window Substring (sliding window)

Resources

  • LeetCode — search “Two Pointers” tag; filter by difficulty
  • NeetCode.io — free roadmap with Two Pointers section and video explanations
  • “Introduction to Algorithms” (CLRS) — Chapter on sorting gives you the sorted-array foundation
  • Grokking the Coding Interview — has a dedicated Two Pointers pattern chapter

“The best algorithm is the one you understand well enough to debug at 2 AM under interview pressure. Know the why behind every pointer move — not just the code.”

We started from a nested loop embarrassment and arrived at trapping rainwater with a single linear pass. That’s the journey of learning this pattern — from brute force to elegance, from O(n²) to O(n). Every problem you solve with Two Pointers builds the instinct to spot the next one faster. Go grind those LeetCode problems, share your solutions with the crew, and let me know when you get Trapping Rain Water on the first try. That’s a milestone worth celebrating.

Happy coding. See you in the next one. 🚀

Written for developers who learn best by doing · Two Pointers Pattern · Java · Python · C

Share this with a friend who’s still writing nested loops 😄
`

I built 342 browser-native dev tools – here’s why everything runs client-side

Over the last few months I quietly built ZeroServer.tools — a suite of 342 free developer utilities. JSON formatter, Base64 encoder, hashing, JWT decoder, regex tester, image compressor, PDF tools, CSS generators, calculators, converters, and a lot more.

But the design constraint I imposed from day one is what made this interesting to build: every single tool runs 100% in your browser. No backend. No server calls. Nothing you type ever leaves your device.

Here’s why I made that choice — and how the architecture works.

The problem I kept running into

Whenever I needed a quick tool — format this JSON, encode this string, generate a hash — I’d find a site, paste my data in, and get my result.

But I’d always wonder: where did my data go? Most of these tools have a backend. Your JSON goes to their server. Your JWT (with its payload claims) gets decoded server-side. Your password gets “strength-checked” by someone else’s API.

Most sites are fine. But I was using them for work — sometimes with API keys, sometimes with internal JSON structures I probably shouldn’t paste into random websites.

So I built one where that problem doesn’t exist.

The architecture: Next.js static export on Cloudflare Pages

The entire site is a Next.js static export (output: "export" in next.config.ts). No server-side rendering, no API routes, no Edge Functions. Just static HTML + JS files that Cloudflare serves from the edge.

// next.config.ts
const nextConfig: NextConfig = {
  output: "export",
  trailingSlash: true,
};

This has a real constraint: you can’t do anything server-side. No fs, no fetch at build-time for dynamic data, no crypto from Node. Everything that runs at request time has to be browser JavaScript.

That constraint turned out to be the right forcing function.

How browser-native APIs cover 95% of what you need

I expected to write a lot of polyfills. I didn’t. The browser has gotten remarkably capable:

Cryptography — Web Crypto API

// SHA-256 hash of any string
const buf = await crypto.subtle.digest(
  "SHA-256",
  new TextEncoder().encode(input)
);
const hex = Array.from(new Uint8Array(buf))
  .map(b => b.toString(16).padStart(2, "0"))
  .join("");

No Node crypto module needed. SHA-1, SHA-256, SHA-384, SHA-512 are all built into crypto.subtle. HMAC, PBKDF2, AES-GCM — all there.

File operations — FileReader + Canvas API

The image tools (compressor, resizer, format converter, watermark, EXIF stripper) all use FileReader to read files and HTMLCanvasElement to process them. The PDF tools use pdf-lib compiled to WebAssembly, which runs entirely in the browser.

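// Image compression, client-side only
// Assumes `file` is a File from an <input type="file"> and `quality` is a number in [0, 1].
const img = new Image();
img.onload = () => {
  URL.revokeObjectURL(img.src); // release the temporary object URL
  const canvas = document.createElement("canvas");
  canvas.width = img.naturalWidth;
  canvas.height = img.naturalHeight;
  const ctx = canvas.getContext("2d")!;
  ctx.drawImage(img, 0, 0);
  canvas.toBlob((blob) => {
    if (!blob) return; // toBlob passes null if the image could not be created
    // blob is the compressed image; it never touched a server
  }, "image/webp", quality);
};
img.src = URL.createObjectURL(file);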

Random/UUID — crypto.randomUUID()

const id = crypto.randomUUID(); // cryptographically random, browser-native

The SSR problem — and how I solved it

Next.js does server-side rendering at build time (for static export). This means your component renders once in Node.js at build time, then hydrates in the browser.

Problem: anything that uses window, document, FileReader, Date.now(), or Math.random() will crash the build or produce hydration mismatches.

The solution I settled on:

  1. Pure logic in useMemo — formatting, parsing, encoding, converting. These are deterministic: same input → same output. No SSR problems.

  2. Side effects in useEffect — anything that touches the DOM, reads files, or uses browser-only APIs. This runs client-side only.

  3. Seeded PRNGs for “random” output — for tools like the Lorem Ipsum generator and quote picker, I use a deterministic seeded PRNG so the SSR output matches the client render:

function prng(seed: number): number {
  return Math.abs(Math.sin(seed + 1.618) * 1e8) % 1;
}

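Math.sin with a fixed seed is pure: the same input gives the same output on every call, with no hidden state. The build-time render in Node and the client render agree as long as both engines compute Math.sin identically, which is typically the case for V8 in Node and Chrome. ECMAScript leaves the exact precision of Math.sin implementation-defined, so other engines can differ in the last bits. When they agree, there is no hydration mismatch.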

  4. new Date(string) is safe, new Date() is not. new Date() returns the current time, which differs between build-time Node and client runtime. new Date("2025-01-01T00:00:00Z") is deterministic and safe anywhere.

The registry-driven architecture

All 342 tools share a single source of truth: lib/tools.ts. Every nav item, dashboard card, search result, sitemap entry, and SEO metadata is derived from this registry. Adding a new tool is:

  1. Create app/<slug>/layout.tsx (SEO metadata)
  2. Create app/<slug>/page.tsx (the tool)
  3. Add one entry to lib/tools.ts
  4. Add chain links to lib/toolChains.ts (the “try also” suggestions)

No config files, no separate sitemap file to maintain, no duplicate metadata.

What’s in the stack

  • Next.js 16 with App Router, static export
  • React 19
  • Tailwind CSS v4 (JIT, no config file)
  • Lucide React for icons (one icon per tool)
  • Cloudflare Pages for hosting (generous free tier, global CDN)
  • No database, no auth, no backend

The tools people use most

Based on the categories that get the most traffic from search:

  1. JSON Formatter — the classic, still drives a lot of traffic
  2. Image Compressor — “files never leave your browser” resonates for images
  3. PDF tools — merge, split, rotate, all client-side (pdf-lib/WASM)
  4. Hash generators — SHA-256, MD5, HMAC
  5. Base64 — evergreen for devs

The privacy angle matters most for crypto/hashing and image/PDF tools, where people are most sensitive about where their files go.

Try it

zeroserver.tools — 342 tools, 100% client-side, free, no signup.

I’d genuinely love feedback — especially:

  • Which tools feel off or inaccurate?
  • What’s missing that you reach for regularly?
  • Does the “nothing leaves your browser” claim come through clearly?

I’m building this in the open and taking requests.

I tested whether a code health score actually predicts bugs. Here’s the benchmark

Most code health scores are vibes. A number goes up, a number goes down, and nobody checks whether the files it flags are the files that actually break later. I wanted to know if the score I built does better than that, so I ran it against a defect corpus and put it head to head with the leading commercial code-health tool.

On the same 2,770 files across 9 languages, scored at the same leakage-free commit against the same defect labels, the score surfaces 2.3x the defects under a fixed review budget.

This post is how that works, and the four other layers sitting next to it in repowise.

What the score is

Every file gets a 1 to 10 score from 25 deterministic biomarkers. McCabe complexity, deep nesting, brain methods, class cohesion (LCOM4), god classes, native Rabin-Karp clone detection, untested hotspots, function-level churn, code-age volatility, ownership dispersion, change entropy, co-change scatter, prior-defect history, test-quality smells, and more.

No LLM calls. No cloud. No new runtime dependency. It is pure Python over tree-sitter and git data, and it finishes in under 30 seconds on a 3,000-file repo.

repowise health                       # KPIs + lowest-scoring files
repowise health --coverage cov.lcov   # ingest LCOV/Cobertura/Clover
repowise health --refactoring-targets # ranked by impact / effort
repowise health --trend               # snapshots + declining alerts

The biomarker weights are calibrated against a real defect corpus instead of hand-tuned. Only the learned constants ship. The runtime itself stays fully deterministic, so the same file produces the same score every time.

Does the score find bugs

The validation setup avoids the usual leakage trap. Health scores are collected at a historical commit (call it T0). Bug-fixing commits are counted over the following 6 months. Then the two get correlated. The score never sees the future it is being graded on.
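The setup above can be sketched in a few lines. Everything here is hypothetical toy data, not the benchmark's corpus, and `roc_auc` is a plain pairwise implementation rather than repowise's harness. It assumes a lower health score means higher risk, so the score is inverted before ranking.

```python
def roc_auc(risk: list[float], is_defective: list[bool]) -> float:
    """Probability that a random defective file outranks a random clean one.

    Ties count as half a win. O(P * N) pairwise version, fine for a sketch.
    """
    pos = [r for r, d in zip(risk, is_defective) if d]
    neg = [r for r, d in zip(risk, is_defective) if not d]
    if not pos or not neg:
        raise ValueError("need at least one defective and one clean file")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores collected at the historical commit T0 ...
health_at_t0 = {"a.py": 3.1, "b.py": 8.7, "c.py": 4.0, "d.py": 9.2, "e.py": 4.4}
# ... and hypothetical bug-fix commit counts from the 6 months AFTER T0.
bugfixes_after_t0 = {"a.py": 2, "b.py": 0, "c.py": 0, "d.py": 0, "e.py": 1}

files = sorted(health_at_t0)
risk = [10.0 - health_at_t0[f] for f in files]      # low health = high risk
labels = [bugfixes_after_t0[f] > 0 for f in files]  # labels come only from the future window

print(round(roc_auc(risk, labels), 3))  # 0.833
```

The point of the structure is that nothing from the label window feeds into `health_at_t0`; the score and the outcome come from disjoint slices of history.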

Across 21 open-source repositories spanning all 9 Full-tier languages:

  • Cross-project mean ROC AUC of 0.74 [95% CI 0.68 to 0.79] at identifying files that go on to receive bug fixes. Up to 0.90 on individual repos.
  • It survives controlling for file size (partial Spearman rho = -0.16). It is not just flagging the big files.
  • It out-discriminates recent churn by +0.10 AUC and prior-defect history by +0.12 AUC, DeLong p < 1e-9.
  • It holds on an external published dataset it has never seen (PROMISE/jEdit CK-metrics: AUC 0.76 to 0.78, within about 0.03 of that dataset’s own tuned model).

Head to head

Same files, same commit, same labels, paired tests against the leading commercial code-health tool:

| Axis | repowise | Commercial tool |
| --- | --- | --- |
| Recall @ 20%-of-lines budget | 0.173 | 0.074 |
| Effort-aware ranking (Popt) | 0.607 | 0.462 |
| Defect density, size-normalized (Alert:Healthy) | 2.18x | 0.56x |
| Discrimination (ROC AUC) | 0.731 | 0.705 |

Ranking files by repowise health surfaces 2.3x the defects under a fixed review budget. Popt delta +0.144, recall delta +0.098, density delta p = 0.003, all paired and significant.
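To show what a recall-under-budget number means, here is a sketch using one common definition: review files from worst health upward until the next file would push past 20% of total lines, then count the share of defects already seen. The data and the `recall_at_loc_budget` helper are hypothetical; the benchmark's exact procedure is whatever its published methodology specifies.

```python
def recall_at_loc_budget(files: list[dict], budget: float = 0.20) -> float:
    """Share of all defects found when reviewing the worst files within a LOC budget."""
    total_loc = sum(f["loc"] for f in files)
    total_defects = sum(f["defects"] for f in files)
    if total_loc == 0 or total_defects == 0:
        raise ValueError("need non-empty code and at least one defect")

    allowed = budget * total_loc
    spent = 0
    found = 0
    for f in sorted(files, key=lambda f: f["health"]):  # lowest health first
        if spent + f["loc"] > allowed:
            break  # the next file would exceed the review budget
        spent += f["loc"]
        found += f["defects"]
    return found / total_defects


# Hypothetical toy repository: 1,000 lines, 6 defects.
toy = [
    {"name": "a.py", "health": 2.0, "loc": 100, "defects": 3},
    {"name": "b.py", "health": 3.5, "loc": 80, "defects": 1},
    {"name": "c.py", "health": 6.0, "loc": 400, "defects": 2},
    {"name": "d.py", "health": 8.0, "loc": 300, "defects": 0},
    {"name": "e.py", "health": 9.0, "loc": 120, "defects": 0},
]
print(round(recall_at_loc_budget(toy), 3))  # 0.667

# The headline ratio follows from the reported recall values:
print(round(0.173 / 0.074, 1))  # 2.3
```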

Full methodology and CIs.

The four other layers

Code health is one of five. The point of the other four is that your AI coding agent reads files but knows nothing about how the codebase got there.

Graph. A real tree-sitter dependency graph across 15 languages. File and symbol nodes, 3-tier call resolution, Leiden communities, PageRank, framework-aware route-to-handler edges.

Git. Behavioral signals static analysis cannot see. Hotspots (churn times complexity), ownership percentages, co-change pairs that expose hidden coupling, bus factor, reviewer suggestions.

Docs. An LLM-generated wiki per module and file, rebuilt incrementally on every commit with freshness and confidence scoring, searchable via hybrid RAG (full-text plus vector through reciprocal rank fusion).
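Reciprocal rank fusion itself is small enough to sketch. Each document scores 1 / (k + rank) in every ranking it appears in, and the scores are summed. The constant k = 60 is a commonly used default and an assumption here, as are the file names; the value repowise uses is not stated.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists into one, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by name for a stable result.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


full_text_hits = ["auth.md", "session.md", "db.md"]  # hypothetical full-text results
vector_hits = ["session.md", "cache.md", "auth.md"]  # hypothetical vector results

print(reciprocal_rank_fusion([full_text_hits, vector_hits]))
# ['session.md', 'auth.md', 'cache.md', 'db.md']
```

Documents that appear in both lists rise to the top even when neither retriever ranked them first, which is the reason to fuse ranks instead of raw scores.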

Decisions. Architectural decisions mined from 8 sources, evidence-backed, linked to graph nodes, connected by supersedes / refines / conflicts_with edges. This is the layer most tools capture nowhere.

The agent angle

All five layers expose through nine MCP tools shaped around tasks, not data entities. You pass multiple files or symbols in one call and get complete context back, instead of chaining 30 greps and reads.

Paired SWE-QA runs on real repos, same model and harness, with and without the MCP tools:

  • 70% fewer tool calls
  • 89% fewer file reads
  • 36% lower cost per query
  • answer quality at parity

Feeding an agent a commit through get_context costs 2,391 tokens versus 64,039 for the raw changed files. About 27x fewer.

There is also repowise distill, which compresses noisy command output before the agent reads it, errors first, every omission reversible:

| Command | Raw to distilled tokens | Saved |
| --- | --- | --- |
| pytest -q (11 failures) | 3,374 to 1,317 | 61% |
| git log -50 | 3,064 to 331 | 89% |
| git diff (30 commits) | 62,833 to 8,635 | 86% |
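The percentages and the "about 27x" figure follow directly from the token counts quoted above. A quick check, using only the numbers in this post:

```python
# (raw tokens, distilled tokens) as reported in the table above
rows = {
    "pytest -q (11 failures)": (3374, 1317),
    "git log -50": (3064, 331),
    "git diff (30 commits)": (62833, 8635),
}

saved = {cmd: round(100 * (1 - distilled / raw)) for cmd, (raw, distilled) in rows.items()}
for cmd, pct in saved.items():
    print(f"{cmd}: {pct}% saved")
# pytest -q (11 failures): 61% saved
# git log -50: 89% saved
# git diff (30 commits): 86% saved

# get_context versus raw changed files for one commit
ratio = round(64039 / 2391, 1)
print(ratio)  # 26.8, i.e. about 27x fewer tokens
```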

Try it

pip install repowise
cd your-project
repowise init        # builds all five layers
repowise serve       # MCP server + local dashboard

The graph, git, dead-code, and health layers build in minutes with zero LLM calls. Run repowise init --index-only for a queryable index almost immediately. After that, every commit-triggered update takes under 30 seconds and only regenerates the pages your change touched.

100% local, bring your own API key, AGPL-3.0.

Repo, benchmarks, and live demo: github.com/repowise-dev/repowise and repowise.dev.

If you run the health-defect benchmark on your own repos, I want to see the numbers. The whole harness is public so you can reproduce or break it.