SoloEngine: How to Let AI Run Every Industry

With three years of experience in large language model algorithms, agent development, and knowledge base construction, I recently had a thought: Vibe Coding emerged in the programming industry simply because programmers know how to write code. Other industries don’t have a Cursor or a Claude Code, not because they lack the need for Agentic AI, but because their practitioners don’t work with LangChain or CrewAI. I wanted to build a tool that lowers the barrier to Agentic AI development to the simplicity of workflow tools like Dify. Thus, SoloEngine was born.

SoloEngine, as the first low‑code Agentic AI development platform, fully encapsulates mechanisms such as ReAct, Tool, MCP, Skill, and SubAgent into backend services. When using it, you simply drag an agent onto the canvas, connect collaboration relationships, configure the required tools, and click Run. The backend then automatically compiles everything into your very own Claude Code — planning, execution, and delivery are all autonomously completed by the agent.

Comparison: SoloEngine vs Other Solutions

Feature Dify, n8n, Zapier LangChain, CrewAI, LangGraph SoloEngine
Agentic AI ✗ Scripted workflows only ✓ ReAct / Multi‑Agent ✓ ReAct / Multi‑Agent
No coding required ✗ Python mandatory
Visual orchestration Partial support ✓ Full canvas experience
Domain experts can build independently ✓ (but workflows are not truly Agentic)
Multi‑agent collaboration

Core Design

For compilation efficiency, all agent nodes adopt a unified ReAct architecture. The platform parses the supervisor‑subordinate relationships between agents from the canvas topology, which is what enables connections and SubAgent calls. The visual design on the canvas is compiled directly into an executable agent team.

At runtime, each agent employs progressive disclosure, loading only the MCPs and Skills it needs on demand — token consumption can be reduced by over 85%.

On the model side, SoloEngine covers commonly used model providers such as OpenAI, Anthropic, Ollama, DeepSeek, Qwen, and Zhipu behind a unified interface, so you can switch between them seamlessly.

Release Updates

After more than a dozen development iterations, the v0.2 file change tracking and rollback mechanism has been released and is relatively stable. An official release build will be available soon. v0.3’s one‑click deployment feature for Agentic AI is in its final stages; it will allow compiled agent teams to be packaged as standalone products for self‑deployment or for distribution and sale. Meanwhile, long‑term memory and autonomous evolution are also on the roadmap.

Quick Start

git clone https://github.com/Sh4r1ock/SoloEngine.git
cd SoloEngine

# Backend (Python 3.11+)
cd backend
pip install -r requirements.txt
python main.py

# Frontend (Node.js 18+) — run in another terminal
cd frontend
npm install
npm run dev

Open http://localhost:8991 to build your first agent team.

Get Involved

The project is currently in a phase of rapid iteration. More participants are welcome to help AI drive every industry. We hope that in the future, AI will evolve from Vibe Coding into Vibe Everything.

Project repository: https://github.com/Sh4r1ock/SoloEngine

TLS Fingerprinting: How JA3 and JA4 Identify You Before You Send a Byte

Encryption hides the contents of your HTTPS connection — but the negotiation that sets up that encryption happens in the clear. The very first message your client sends, before a single byte of application data, has a distinctive shape. JA3 and JA4 turn that shape into a fingerprint that can identify your software, and sometimes route, throttle, or block you on the spot.

Every HTTPS connection starts with a TLS handshake, and the handshake starts with a message called the ClientHello. It is sent unencrypted, because the two sides have not yet agreed on a key. Inside it, your client announces everything it is willing to do: which TLS versions it supports, which cipher suites it prefers and in what order, which extensions it understands, which elliptic curves and signature algorithms it offers.

None of that is secret. None of it has to be. But taken together, the exact set and ordering of those parameters is remarkably specific to a particular piece of software at a particular version. Chrome 124 produces a different ClientHello from Firefox, which produces a different one from Python’s requests library, which differs from Go’s standard library, which differs from a curl built against a specific OpenSSL version. TLS fingerprinting is the practice of hashing that ClientHello into a short, stable identifier and looking it up.

What Goes Into the Fingerprint

The original technique, JA3, was published by three engineers at Salesforce in 2017 — John Althouse, Jeff Atkinson, and Josh Atkins, whose initials gave it the name. JA3 builds a string from five fields of the ClientHello, in order:

  • The TLS version offered
  • The list of cipher suites
  • The list of extensions
  • The list of supported elliptic curves (named groups)
  • The list of elliptic-curve point formats

Each field is rendered as its numeric values joined by hyphens, the fields are joined by commas, and the whole string is hashed with MD5 to produce a 32-character fingerprint. A companion technique, JA3S, does the same for the server’s ServerHello, so you can fingerprint both ends of a conversation. Pairing a client JA3 with a server JA3S is a common way to identify specific malware command-and-control channels, because the malware and its server both produce consistent, unusual hashes.

Why ordering matters: Two clients can support the exact same cipher suites and still fingerprint differently, because they offer them in a different preference order. That ordering is baked into the TLS library and rarely changes between builds — which is exactly what makes it a stable signal.
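To make the construction concrete, here is a minimal sketch of the JA3 string and hash in TypeScript. The `Ja3Fields` shape and the sample values are illustrative assumptions, not parsed from a real ClientHello; only the delimiter scheme and the MD5 step come from the description above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical container for the five ClientHello fields JA3 reads.
interface Ja3Fields {
  version: number;        // e.g. 771 for TLS 1.2 (0x0303)
  ciphers: number[];      // in the order the client offered them
  extensions: number[];
  curves: number[];
  pointFormats: number[];
}

// Values within a field are joined by hyphens, fields by commas.
function ja3String(f: Ja3Fields): string {
  return [
    String(f.version),
    f.ciphers.join("-"),
    f.extensions.join("-"),
    f.curves.join("-"),
    f.pointFormats.join("-"),
  ].join(",");
}

// The fingerprint is the MD5 of that string, as 32 hex characters.
function ja3Hash(f: Ja3Fields): string {
  return createHash("md5").update(ja3String(f)).digest("hex");
}

// Made-up sample: two clients with the same cipher set in a different order.
const clientA: Ja3Fields = {
  version: 771,
  ciphers: [4865, 4866, 49195],
  extensions: [0, 10, 11],
  curves: [29, 23],
  pointFormats: [0],
};
const clientB: Ja3Fields = { ...clientA, ciphers: [49195, 4865, 4866] };

console.log(ja3String(clientA)); // 771,4865-4866-49195,0-10-11,29-23,0
console.log(ja3Hash(clientA) === ja3Hash(clientB)); // false
```

Because the string preserves offer order, the two sample clients hash differently even though they support the same suites.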

Why JA3 Started to Break

JA3 worked well for years, but two developments eroded it. The first was GREASE (RFC 8701), a mechanism Google introduced to keep the TLS ecosystem flexible. GREASE makes clients insert random reserved values into their cipher and extension lists, so that middleboxes don’t hard-code assumptions about what they see. The side effect is that a naive JA3 implementation produces a different hash on every connection unless it explicitly strips the GREASE values out.
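As a rough illustration of what "stripping GREASE" means, the sketch below filters out the reserved values before a list is fingerprinted. It assumes the RFC 8701 pattern for cipher suites and extensions (0x0A0A, 0x1A1A, through 0xFAFA); the function names and sample lists are made up for this example.

```typescript
// GREASE values for cipher suites and extensions have two identical bytes,
// each with a low nibble of 0xA: 0x0A0A, 0x1A1A, ..., 0xFAFA.
function isGrease(value: number): boolean {
  const high = (value >> 8) & 0xff;
  const low = value & 0xff;
  return high === low && (low & 0x0f) === 0x0a;
}

function stripGrease(values: number[]): number[] {
  return values.filter((v) => !isGrease(v));
}

// Two connections from the same hypothetical client: only the GREASE value differs.
const firstConnection = [0x0a0a, 4865, 4866];
const secondConnection = [0xdada, 4865, 4866];

console.log(stripGrease(firstConnection).join("-"));  // 4865-4866
console.log(stripGrease(secondConnection).join("-")); // 4865-4866
```

Without the filter the two lists would produce different JA3 strings; with it they collapse to the same one.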

The second was TLS 1.3 and the rise of extension shuffling. Chrome began randomizing the order of some ClientHello extensions on each connection specifically to discourage fingerprinting and ossification. Against a technique that depends on extension ordering, that is fatal: the same browser now yields many different JA3 hashes.

JA4: The Redesign

In 2023, John Althouse — one of the original JA3 authors, now at FoxIO — released JA4, the centerpiece of a broader suite called JA4+ that fingerprints not just TLS but HTTP, TCP, SSH, and more. JA4 was designed to survive the things that broke JA3.

The biggest structural change is that JA4 is partly human-readable. Instead of one opaque MD5, a JA4 fingerprint is divided into sections you can read at a glance:

  • A prefix describing the transport and TLS version, whether SNI is present, the count of cipher suites, the count of extensions, and the first ALPN value — for example, whether the client is speaking HTTP/2 or HTTP/1.1
  • A truncated hash of the cipher suites, sorted numerically so that order-shuffling no longer changes the result
  • A truncated hash of the extensions and signature algorithms, also handled so that cosmetic reordering doesn’t matter

GREASE values are stripped by definition. Because the cipher and extension lists are sorted before hashing, Chrome’s randomization no longer produces a moving target. The result is a fingerprint that is both more stable than JA3 and more informative, because a human analyst can read meaningful structure out of the prefix without consulting a lookup table.
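The sort-then-hash idea can be sketched in a few lines. This is a simplified, JA4-style illustration and not a conformant implementation: the real specification has additional rules this sketch omits, and the `HelloSummary` shape, the `alpnTag` input, and the sample values are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical pre-parsed ClientHello summary.
interface HelloSummary {
  transport: "t" | "q";          // TCP or QUIC
  tlsVersion: string;            // e.g. "13" for TLS 1.3
  hasSni: boolean;
  ciphers: number[];
  extensions: number[];
  signatureAlgorithms: number[];
  alpnTag: string;               // two-character tag such as "h2" (assumed input)
}

function isGrease(v: number): boolean {
  return ((v >> 8) & 0xff) === (v & 0xff) && (v & 0x0f) === 0x0a;
}
function hex4(v: number): string {
  return ("0000" + v.toString(16)).slice(-4);
}
function count2(n: number): string {
  return ("0" + Math.min(n, 99)).slice(-2);
}
function hash12(s: string): string {
  return createHash("sha256").update(s).digest("hex").slice(0, 12);
}
function sortedHex(values: number[]): string {
  return values.slice().sort((a, b) => a - b).map(hex4).join(",");
}

function ja4Style(h: HelloSummary): string {
  const ciphers = h.ciphers.filter((v) => !isGrease(v));
  const exts = h.extensions.filter((v) => !isGrease(v));
  // Readable prefix: transport, version, SNI flag, counts, ALPN tag.
  const prefix =
    h.transport + h.tlsVersion + (h.hasSni ? "d" : "i") +
    count2(ciphers.length) + count2(exts.length) + h.alpnTag;
  // Sorting before hashing makes both sections order-independent.
  const cipherHash = hash12(sortedHex(ciphers));
  const extHash = hash12(sortedHex(exts) + "_" + h.signatureAlgorithms.map(hex4).join(","));
  return [prefix, cipherHash, extHash].join("_");
}

const original: HelloSummary = {
  transport: "t", tlsVersion: "13", hasSni: true, alpnTag: "h2",
  ciphers: [0x0a0a, 0x1301, 0x1302, 0xc02b],
  extensions: [0x000a, 0x000b, 0x002b, 0x0033],
  signatureAlgorithms: [0x0403, 0x0804],
};
// Same client, different GREASE value and shuffled extension order.
const shuffled: HelloSummary = {
  ...original,
  ciphers: [0xdada, 0x1301, 0x1302, 0xc02b],
  extensions: [0x0033, 0x1a1a, 0x000a, 0x002b, 0x000b],
};

console.log(ja4Style(original) === ja4Style(shuffled)); // true
```

The prefix for this sample reads `t13d0304h2`: TCP, TLS 1.3, SNI present, three ciphers, four extensions, HTTP/2.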

Property | JA3 (2017) | JA4 (2023)
Output | Single MD5 hash | Structured, partly human-readable
Handles GREASE | Only if implementation strips it | Yes, by design
Survives extension shuffling | No: order-dependent | Yes: lists are sorted
Scope | TLS ClientHello / ServerHello | TLS, HTTP, TCP, SSH and more (JA4+)

Who Uses This, and For What

TLS fingerprinting is genuinely dual-use. On the defensive side, it is one of the more useful tools a network operator has. A client that claims to be Chrome in its User-Agent header but whose ClientHello matches Python’s requests is almost certainly a bot lying about itself. Security teams use JA3/JA4 to spot malware beaconing, to cluster automated traffic, and to flag scrapers that don’t match any real browser. Because the fingerprint is computed from bytes the client cannot easily fake without rebuilding its TLS stack, it is harder to spoof than a header.

That same strength is what makes it a censorship and tracking tool. A national firewall or a corporate middlebox can fingerprint every outbound connection and treat traffic differently based on what software produced it — throttling or blocking a circumvention tool whose handshake doesn’t look like a mainstream browser, even though it cannot read the encrypted payload. Anti-bot vendors and CDNs fingerprint connections to decide who gets served and who gets a challenge. The fingerprint becomes a passive selector applied before you have proven anything about who you are.

The encryption is doing its job perfectly. The leak is in the envelope, not the letter — and the envelope is, by necessity, written in the clear.

Can You Defend Against It?

Not cleanly, and that is the uncomfortable part. Because the fingerprint is derived from how your TLS library behaves, the only thorough defense is to make your traffic produce a common, unremarkable fingerprint — to look like everyone else. Circumvention tools increasingly do exactly this through uTLS, a Go library that lets a client mimic the precise ClientHello of a mainstream browser, GREASE and ordering included, so its JA3/JA4 blends into the crowd.

For an ordinary user, the practical reality is simpler: using a current, mainstream browser is itself a form of crowd-blending, because millions of others produce a near-identical handshake. The danger zone is unusual software — a custom client, an old library, a niche tool — that produces a rare fingerprint precisely because few others share it. This is the same logic that governs browser fingerprinting at the application layer: distinctiveness is the vulnerability, and the anonymity set is the defense.

The Broader Lesson

TLS fingerprinting is a clean illustration of a pattern that runs through nearly all privacy engineering: encrypting the contents of a channel does not hide the channel’s metadata, and the metadata is often enough. The handshake has to be in the clear so two strangers can agree on a key. The shape of that handshake leaks the identity of the software making it. No amount of payload encryption closes that gap, because the gap exists before encryption begins.

The honest takeaway is not that TLS is broken — it isn’t — but that “the connection is encrypted” answers a narrower question than most people think. Knowing what your tools reveal in the clear, and choosing tools whose visible behavior is common rather than distinctive, is the part of the threat model that fingerprinting forces you to take seriously.

Originally published at havenmessenger.com

RAG with Postgres pgvector in 2026: the full TypeScript pipeline.

I spent a week evaluating dedicated vector databases before deciding to just use the Postgres instance I already had. The pgvector extension handles similarity search well enough for most production workloads, and it collapses three infrastructure components into one. This walkthrough covers everything from schema to answer: chunk your docs, embed them, store in pgvector, retrieve by cosine similarity, and wire the results into an LLM call.

TL;DR

Step | Tool | Why
Enable vector store | pgvector 0.8.x, HNSW index | Runs in your existing Postgres, no extra infra
Embed | text-embedding-3-small (1,536 dims) | $0.02 per million tokens, fast
Query | <=> cosine distance, top-k | Works with both OpenAI and Voyage models
Augment | Claude or GPT-4o with retrieved docs | Context window stuffed, hallucination rate drops

1. Why pgvector instead of a dedicated vector database

Pinecone and Weaviate are good products. If you need multi-tenant isolation, sub-millisecond p99 at 100M+ vectors, or native hybrid search with BM25, they earn their place. For most teams, those are future problems.

The cost calculus changes when you consider ops burden. A dedicated vector DB means a new billing line, a new set of credentials to rotate, a new failure mode to track, and a new SDK to keep current in your application. pgvector runs as a Postgres extension: one connection string, one backup strategy, one source of truth. At 10M documents with 1,536-dimensional embeddings, an HNSW index on a reasonably sized Postgres instance returns top-10 results in under 10ms. That covers the overwhelming share of RAG use cases.

pgvector 0.8.0 added iterative HNSW scans. That release made filtered similarity search practical without falling back to sequential scans every time a WHERE clause got specific. The 0.8.0 release was what tipped my team from “maybe later” to “ship it.”

2. Schema setup

Enable the extension once per database, then create your table.

-- enable pgvector (run once per database)
CREATE EXTENSION IF NOT EXISTS vector;

-- documents table
CREATE TABLE documents (
  id         BIGSERIAL PRIMARY KEY,
  source     TEXT NOT NULL,          -- filename, URL, or ID of source doc
  chunk_idx  INT NOT NULL,           -- chunk number within the source
  content    TEXT NOT NULL,          -- raw text of the chunk
  embedding  vector(1536) NOT NULL,  -- OpenAI text-embedding-3-small
  created_at TIMESTAMPTZ DEFAULT NOW()
);

Choosing between HNSW and IVFFlat

HNSW builds a navigable small-world graph. Queries scan the graph instead of comparing all rows. Build once, query immediately. The tradeoff is that the index takes more memory: roughly 8 bytes per dimension per row for a 1,536-dim column at default settings.

IVFFlat partitions the embedding space into centroid clusters. Faster to build, smaller memory footprint, but you must load rows before building the index or the centroid assignment is useless. If you are starting from zero rows, build HNSW.

-- HNSW index (recommended default)
-- m = connections per layer (default 16), higher = better recall at higher memory cost
-- ef_construction = candidate list during build (default 64), higher = better recall at slower build
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- IVFFlat alternative (only after loading rows)
-- lists = sqrt(row_count) is a good starting point for large tables
-- the operator class must match the query operator (<=> needs vector_cosine_ops)
-- CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

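Use vector_cosine_ops with the <=> operator for cosine distance; it is the natural default for OpenAI and Voyage embeddings, which are normalized to unit length. Use vector_l2_ops with <-> for raw Euclidean distance when vectors are not normalized. Use vector_ip_ops with <#> for inner product: on normalized vectors it ranks results the same way as cosine distance and saves one normalization step. Note that <#> returns the negative inner product, so an ascending ORDER BY still puts the best match first.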
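To see why the three operators agree on normalized vectors, here is a small standalone sketch that computes each distance in plain TypeScript. The helper names are made up for this example and nothing here touches Postgres; it only mirrors what <=>, <-> and <#> compute.

```typescript
type Vec = number[];

const dot = (a: Vec, b: Vec): number =>
  a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
const norm = (a: Vec): number => Math.sqrt(dot(a, a));
const normalize = (a: Vec): Vec => {
  const n = norm(a);
  return a.map((x) => x / n);
};

// What <=> computes: 1 - cosine similarity.
const cosineDistance = (a: Vec, b: Vec): number => 1 - dot(a, b) / (norm(a) * norm(b));
// What <-> computes: Euclidean distance.
const l2Distance = (a: Vec, b: Vec): number =>
  Math.sqrt(a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0));
// What <#> computes: the negative inner product.
const negativeInnerProduct = (a: Vec, b: Vec): number => -dot(a, b);

const a = normalize([3, 4]); // [0.6, 0.8]
const b = normalize([1, 0]); // [1, 0]

console.log(cosineDistance(a, b));       // approximately 0.4
console.log(negativeInnerProduct(a, b)); // approximately -0.6
console.log(l2Distance(a, b) ** 2);      // approximately 0.8
```

For unit vectors, cosine distance equals 1 plus the negative inner product, and squared L2 distance equals twice the cosine distance, so all three produce the same ranking.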

3. Ingest pipeline in TypeScript

The ingest function chunks a document, calls the embedding API, and bulk inserts rows. Use postgres (the npm package, not pg) for its tagged-template SQL and native array support.

import postgres from "postgres";
import OpenAI from "openai";

const sql = postgres(process.env.DATABASE_URL!);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });

const CHUNK_SIZE = 512;   // target size in tokens; the naive chunker below approximates tokens with words
const CHUNK_OVERLAP = 64; // overlap between adjacent chunks, in the same units

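function chunkText(text: string, size: number, overlap: number): string[] {
  // naive word-boundary chunker; swap for tiktoken in production
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  // split on whitespace and drop empty strings from leading/trailing whitespace
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  let start = 0;
  while (start < words.length) {
    const end = Math.min(start + size, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break; // last chunk reached; skip a tail that is pure overlap
    start += size - overlap;
  }
  return chunks;
}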

async function embedBatch(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  return response.data.map((d) => d.embedding);
}

export async function ingestDocument(source: string, text: string): Promise<void> {
  const chunks = chunkText(text, CHUNK_SIZE, CHUNK_OVERLAP);

  // embed in batches of 100 (a conservative size; the API accepts larger input arrays)
  const BATCH = 100;
  for (let i = 0; i < chunks.length; i += BATCH) {
    const batch = chunks.slice(i, i + BATCH);
    const embeddings = await embedBatch(batch);

    const rows = batch.map((content, j) => ({
      source,
      chunk_idx: i + j,
      content,
      embedding: JSON.stringify(embeddings[j]),
    }));

    await sql`
      INSERT INTO documents (source, chunk_idx, content, embedding)
      SELECT
        r.source,
        r.chunk_idx::int,
        r.content,
        r.embedding::vector
      FROM jsonb_to_recordset(${JSON.stringify(rows)}::jsonb)
        AS r(source text, chunk_idx text, content text, embedding text)
    `;
  }

  console.log(`[ingest] ${source}: ${chunks.length} chunks stored`);
}

A note on chunk size: 512 words is a starting point. The right size depends on your source material. Legal documents with dense paragraphs do better at 256 words. Code files need at least 300 lines or you lose function context. The overlap prevents the embedding from missing a sentence that straddles a chunk boundary.

4. Query pipeline in TypeScript

Embed the user’s question, run a top-k cosine similarity search, return the matching chunks.

export async function queryDocuments(
  question: string,
  topK = 5,
): Promise<Array<{ source: string; content: string; distance: number }>> {
  // embed the question with the same model used at ingest time
  const [embedding] = await embedBatch([question]);
  const embeddingStr = JSON.stringify(embedding);

  const rows = await sql<{ source: string; content: string; distance: number }[]>`
    SELECT
      source,
      content,
      (embedding <=> ${embeddingStr}::vector) AS distance
    FROM documents
    ORDER BY embedding <=> ${embeddingStr}::vector
    LIMIT ${topK}
  `;

  return rows;
}

The <=> operator returns cosine distance (0 = identical, 2 = opposite). Lower numbers win. If you add metadata filters, put them in the WHERE clause so the planner can use the HNSW iterative scan introduced in 0.8.0. Iterative scans are off by default; enable them per session with SET hnsw.iterative_scan = strict_order (or relaxed_order).

// filtered query example: restrict the search to chunks from one source
// (embeddingStr, filterSource, and topK come from the surrounding function)
const rows = await sql<{ source: string; content: string; distance: number }[]>`
  SELECT source, content, (embedding <=> ${embeddingStr}::vector) AS distance
  FROM documents
  WHERE source = ${filterSource}
  ORDER BY embedding <=> ${embeddingStr}::vector
  LIMIT ${topK}
`;

5. Wiring retrieved docs into an LLM call

Concatenate the retrieved chunks into a context block, then call your model of choice. Claude 3.5 Sonnet or GPT-4o both handle long contexts well. Keep the context block under 80,000 tokens for cost reasons.

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

export async function answerWithRAG(question: string): Promise<string> {
  const docs = await queryDocuments(question, 5);

  if (docs.length === 0) {
    return "No relevant documents found.";
  }

  const context = docs
    .map((d, i) => `[${i + 1}] (${d.source})\n${d.content}`)
    .join("\n\n---\n\n");

  const prompt = `You are a helpful assistant. Answer the question using only the provided context.
If the context does not contain the answer, say so.

Context:
${context}

Question: ${question}`;

  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-6-20250929",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });

  const block = response.content[0];
  return block.type === "text" ? block.text : "";
}

The “answer using only the provided context” instruction is load-bearing. Without it, the model mixes retrieval with parametric memory and you cannot tell which is which. If the answer comes from the context, citations work. If it comes from training data, they do not. Force the distinction at the prompt level.

One more thing worth noting: rerank before you send to the LLM. A fast cosine search returns the 5 closest chunks by vector distance, but distance does not always equal usefulness. A cross-encoder reranker (Cohere Rerank costs about $1 per 1,000 queries) takes your top-20 candidates and scores them for actual relevance before you trim to 5. The quality jump is noticeable. Skip the reranker while prototyping, add it before you hit production.
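The shape of the rerank step is roughly this: over-fetch, score each candidate against the question, then trim. The sketch below uses a toy word-overlap scorer as a stand-in; it is not a cross-encoder and not the Cohere API. A real reranker call would be asynchronous, and the names `rerank`, `Scorer`, and `overlapScore` are assumptions for illustration.

```typescript
interface Candidate { source: string; content: string; distance: number }
type Scorer = (question: string, passage: string) => number;

// Score every candidate, sort by score descending, keep the best `keep`.
function rerank(question: string, candidates: Candidate[], score: Scorer, keep: number): Candidate[] {
  return candidates
    .map((candidate) => ({ candidate, relevance: score(question, candidate.content) }))
    .sort((x, y) => y.relevance - x.relevance)
    .slice(0, keep)
    .map((x) => x.candidate);
}

// Toy stand-in scorer: fraction of question words that appear in the passage.
const overlapScore: Scorer = (question, passage) => {
  const words = question.toLowerCase().split(/\s+/).filter(Boolean);
  const passageWords = passage.toLowerCase().split(/\s+/);
  const hits = words.filter((w) => passageWords.indexOf(w) !== -1).length;
  return hits / Math.max(words.length, 1);
};

// The closest chunk by vector distance is not the most useful one here.
const candidates: Candidate[] = [
  { source: "ops.md", content: "postgres backup strategy", distance: 0.1 },
  { source: "index.md", content: "build the hnsw index after bulk load", distance: 0.2 },
];

const best = rerank("how to build hnsw index", candidates, overlapScore, 1);
console.log(best[0]?.source); // index.md
```

In the pipeline above, this would sit between a `queryDocuments(question, 20)` call and the step that builds the context block.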

6. Two gotchas that bite everyone

Chunk size drives recall more than index parameters

Most teams spend hours tuning HNSW m and ef_construction and see marginal gains. The actual lever is chunk size and overlap. A chunk that is too short loses context (the model cannot answer a cross-sentence question). A chunk that is too long pulls in noise, dilutes the embedding, and wastes context window in the LLM call. Run a quick eval: take 20 representative questions, retrieve top-5, then manually score whether the answer appeared in the returned chunks. Adjust chunk size in 100-word steps until recall tops 85%. Then tune the index.
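The quick eval described above reduces to a recall-at-k number. Here is a minimal sketch that assumes you have already retrieved chunks for each question and hand-labeled, in rank order, whether each returned chunk contains the answer. The `ScoredQuestion` shape and the sample labels are hypothetical.

```typescript
// One entry per eval question; hits[i] is the manual label for the i-th returned chunk.
interface ScoredQuestion { question: string; hits: boolean[] }

// Fraction of questions with at least one answering chunk in the top k.
function recallAtK(scored: ScoredQuestion[], k: number): number {
  if (scored.length === 0) return 0;
  const answered = scored.filter((q) => q.hits.slice(0, k).some((hit) => hit)).length;
  return answered / scored.length;
}

// Made-up labels for three questions.
const labeled: ScoredQuestion[] = [
  { question: "How do I rotate credentials?", hits: [false, true, false, false, false] },
  { question: "What is the refund window?", hits: [false, false, false, false, false] },
  { question: "Which regions are supported?", hits: [true, false, false, false, false] },
];

console.log(recallAtK(labeled, 5)); // two of three questions answered
console.log(recallAtK(labeled, 1)); // one of three questions answered
```

Re-run the same labeled set after each chunk-size change so the numbers stay comparable.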

Build the index after bulk loading, not before

HNSW indexing at insert time is slow. If you load 500,000 documents and the HNSW index exists, every INSERT pays the graph update cost. The fast path: load all rows with the index dropped, then build it once with CREATE INDEX. On a table of 500,000 rows with 1,536-dim embeddings, a cold HNSW build takes roughly 8 to 12 minutes on 4 vCPUs. That is far cheaper than the cumulative insert overhead.

-- drop the index before bulk load
DROP INDEX IF EXISTS documents_embedding_idx;

-- ... run your ingest pipeline ...

-- rebuild once after load
CREATE INDEX documents_embedding_idx
  ON documents USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

The bottom line

The full pipeline is about 120 lines of TypeScript and three SQL statements. pgvector 0.8.x is stable enough for production, HNSW is the right default index for most teams, and the two things that matter most for answer quality are chunk size and staying consistent between embed-at-ingest and embed-at-query time (same model, same preprocessing). Dedicated vector DBs are not wrong, they are just a layer you do not need until your row count passes 50M or your recall requirements get strict enough to warrant a tuning team.

What chunk size worked best for your use case? Drop it in the comments.

GDS K S · thegdsks.com · follow on X @thegdsks

Good retrieval beats a better model every time.

Same code, three clocks — letting a quant agent trade on its own without losing the audit trail


In the last post I argued that an LLM should never hold the approval token on a trade. A human approves. The model only proposes. That works as long as a human is in the loop on every order.

Then a user does the obvious thing. They take a strategy the agent wrote, like the backtest, and say “put it on the paper account.”

They expect it to trade: follow the market in, follow it out, update positions while they sleep.

The honest truth at that point: status = 'promoted' was a database flag. Nobody was ticking the strategy’s on_bar. The account didn’t move. That gap was the whole feature.

Closing it means the machine now places orders on live bars with no human clicking approve each time. Which sounds like exactly the thing the last post said not to do.

This post is how you close the gap without throwing away the audit trail. And the four places the trust boundary has to be redesigned the moment no human is in the chair.

The easy half: same code, three clocks

Inalpha holds one invariant tight: the Python file you backtest is the file you paper-trade. No fork for production. You swap two things underneath the strategy — the Clock and the Gateway — and the business logic doesn’t move.

The invariant itself isn’t rare. What’s rare is the thing standing on top of it here.

The author of that file is an LLM. It was vetted by a human. And it’s now running itself on live bars.

Quant engines hold the invariant, but don’t assume an agent wrote the strategy. Agent frameworks assume the LLM, but have nowhere to put a trading harness. Inalpha sits in that seam. And the same-code invariant is exactly what makes the audit chain mean anything: there’s precisely one file to point a signature at.

How it runs — three deployment modes, two clocks, one file:

  • Backtest: a TestClock driven by historical bars; fills simulated against a reference price.
  • Paper (live runner): a LiveClock on real wall-clock time, bars pulled fresh on the strategy’s timeframe, the same matching engine, the order routed out through the real plan/exec path — the only simulated part is that fills are matched locally instead of sent to a broker.
  • Live (real capital): architecturally the same seam — LiveClock, same kernel, same plan/exec path, only the Gateway swapping to a real broker. But real-money trading is deliberately out of scope for this project; holding the invariant isn’t about chasing it. The payoff is narrower and real: backtest and paper are literally one code path, so the audit chain has exactly one file to point a signature at.

So “three clocks” is shorthand: two clock implementations (TestClock / LiveClock), the third mode (real capital) a seam the architecture leaves open but the project doesn’t pursue — and the strategy file never notices which one it’s running under.

The live runner (services/paper/.../live_runner.py) is one long-lived task per running strategy. Each tick it does three things:

  1. pull the latest closed bar;
  2. feed it to a session that reuses the exact backtest kernel, firing the strategy’s on_bar;
  3. intercept the order the strategy emits and hand it to the guarded order path — it does not match locally.

When the fill comes back, it’s replayed into the session. So the strategy’s view of its own position stays consistent with what actually filled.
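The Clock/Gateway seam can be sketched as follows. Inalpha itself is Python, so this TypeScript version is only an illustration of the shape; every name in it (`Clock`, `Gateway`, `RecordingGateway`, `onBar`, `step`) is a hypothetical stand-in, not the project's API.

```typescript
interface Bar { time: number; close: number }
interface Order { side: "buy" | "sell"; qty: number }
interface Clock { now(): number }
interface Gateway { submit(order: Order, at: number): boolean } // true when filled

class TestClock implements Clock {
  private t = 0;
  set(t: number): void { this.t = t; }
  now(): number { return this.t; }
}
class LiveClock implements Clock {
  now(): number { return Date.now(); }
}
class RecordingGateway implements Gateway {
  orders: Order[] = [];
  submit(order: Order, _at: number): boolean {
    this.orders.push(order);
    return true;
  }
}

// The strategy: one function, unaware of which clock or gateway is underneath.
function onBar(bar: Bar, position: number): Order | null {
  if (bar.close > 100 && position === 0) return { side: "buy", qty: 1 };
  if (bar.close < 95 && position > 0) return { side: "sell", qty: position };
  return null;
}

// Shared kernel step: fire the strategy, route the order, replay the fill.
function step(bar: Bar, position: number, clock: Clock, gateway: Gateway): number {
  const order = onBar(bar, position);
  if (order && gateway.submit(order, clock.now())) {
    return position + (order.side === "buy" ? order.qty : -order.qty);
  }
  return position;
}

const bars: Bar[] = [
  { time: 1, close: 99 }, { time: 2, close: 101 },
  { time: 3, close: 103 }, { time: 4, close: 94 },
];

// Backtest: the driver advances a TestClock from historical bar times.
const testClock = new TestClock();
const backtestGateway = new RecordingGateway();
let backtestPosition = 0;
for (const bar of bars) {
  testClock.set(bar.time);
  backtestPosition = step(bar, backtestPosition, testClock, backtestGateway);
}

// Paper: same strategy and step, but wall-clock time and a different gateway instance.
const paperGateway = new RecordingGateway();
let paperPosition = 0;
for (const bar of bars) {
  paperPosition = step(bar, paperPosition, new LiveClock(), paperGateway);
}

console.log(backtestGateway.orders.length, paperGateway.orders.length); // 2 2
```

Only the two objects passed into `step` differ between the runs; `onBar` is the single file a signature would point at.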

Why this matters for audit-grade, not just convenience: if your backtest and live code are two different files, no signature chain will tell you which one ran when the $93k order happened. Same code, three clocks is the precondition. It’s also the boring half. Here’s the half that kept me up.

The hard half: who approves the order?

Last post’s thesis was a three-step state machine. The LLM drives step one. A human drives the approval:

trade.create_plan       → plan: pending_approval
trade.approve_plan      → mints a single-use token
trade.execute_plan(tok) → places the order

A runner that trades while you sleep can’t stop and wait for a click on every bar. So the naive fix is to delete the approval step for the automated path. That’s the fix that quietly turns “audit-grade” back into “trust me.”

We did the opposite. The automated path goes through the same plan/exec state machine. The approval is just stamped approved_by = "system:live_runner".

Machine approval. The order still creates a plan. Still mints and consumes a single-use token. Still writes the same signed audit line. Nothing on the order path got a shortcut.
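A minimal sketch of that state machine, assuming an in-memory store: the same three calls serve both a human approver and the runner, and the token is consumed on use. `PlanBook`, the token format, and the audit line layout are hypothetical; the real path also signs the audit line, which is omitted here.

```typescript
type PlanStatus = "pending_approval" | "approved" | "executed";

interface Plan {
  id: number;
  order: string;
  status: PlanStatus;
  approvedBy: string | null;
  token: string | null;
}

class PlanBook {
  private plans: Record<number, Plan> = {};
  private nextId = 1;
  audit: string[] = [];

  createPlan(order: string): number {
    const id = this.nextId++;
    this.plans[id] = { id, order, status: "pending_approval", approvedBy: null, token: null };
    return id;
  }

  // Mints a single-use token and records who approved.
  approvePlan(id: number, approver: string): string {
    const plan = this.plans[id];
    if (!plan || plan.status !== "pending_approval") throw new Error("plan is not awaiting approval");
    const token = "tok-" + id + "-" + Math.random().toString(36).slice(2);
    plan.status = "approved";
    plan.approvedBy = approver;
    plan.token = token;
    return token;
  }

  // Consumes the token; a second call with the same token fails.
  executePlan(id: number, token: string): void {
    const plan = this.plans[id];
    if (!plan || plan.status !== "approved" || plan.token !== token) {
      throw new Error("invalid or already-used token");
    }
    plan.token = null;
    plan.status = "executed";
    this.audit.push("plan=" + id + " order=" + plan.order + " approved_by=" + plan.approvedBy);
  }
}

const book = new PlanBook();

// Human-approved order.
const manualPlan = book.createPlan("BUY 1");
book.executePlan(manualPlan, book.approvePlan(manualPlan, "user:alice"));

// Runner-approved order: same calls, different approver stamp.
const runnerPlan = book.createPlan("SELL 1");
const runnerToken = book.approvePlan(runnerPlan, "system:live_runner");
book.executePlan(runnerPlan, runnerToken);

console.log(book.audit[1]); // plan=2 order=SELL 1 approved_by=system:live_runner
```

The automated path differs from the manual one in exactly one value, the approver string, which is what makes a replay readable.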

Machine approval is only honest if it’s earned. Ours rests on two human gates upstream, and the agent can’t route around either:

  1. A human promotes the candidate. promote is a deliberate human action, with permission: ask on the agent side. The model can’t self-promote a strategy into the runnable set.
  2. A human starts the run. paper.start_strategy is an explicit call a person makes for a specific market and timeframe.

So the chain reads: a person vetted this strategy, a person chose to run it here. Given those two signatures, having the machine approve each later order on live bars is the expected behavior, not a bypass. The audit line records system:live_runner as the approver for exactly this reason — a replay shows where the human gates were and where the machine took over.

Every order the runner places also writes a decision record (strategy_run_decisions): the bar context, the order intent, and the outcome (filled, rejected, or risk_rejected), cross-referenced to the plan and the trade.

The point of the autonomous path isn’t just that it trades. It’s that the next morning you can read, line by line, every bar where it wanted to act and what the harness did about it.

The trust boundary moves when the human leaves the chair

This isn’t a bug list. It’s four faces of one architectural question.

With a human in the loop, a lot of guarantees are propped up implicitly by “someone is at the screen.” Designing the unattended path means asking that again, on purpose: which of those props has to become something the system holds up on its own?

Four answers.

1. Identity has to become explicit.
When a human starts each run, ownership is implicit — whoever clicked owns it. Automate it, and ownership has to live in the data model, or there’s no boundary at all.

Concretely: the start path checked that a candidate was promoted, not that the caller owned it. So you could run someone else’s strategy on your own account.

The trap in fixing it was real. The candidate’s author_id is only set for UUID identities, while the account id falls back to uuid5 for everyone else. A naive author_id == account_id would lock out every non-UUID user. The fix derives an owner_account_id through the same function as the account id (migration 0013), so ownership is comparable for everyone.

2. Resource bounds are part of the trust boundary, not an ops detail.
A human starting runs self-limits. An API doesn’t. Each run is a long-lived task polling the data service on a timer, and the only limit was one instance per candidate — but a user can promote arbitrarily many. So the boundary grows a per-account cap (default 10) that returns 429, instead of letting one account quietly melt the event loop.
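The cap itself is a small check in the start path. A sketch, assuming an in-memory registry: the default of 10 and the 429 come from the text above, while the `RunRegistry` name and the 201 and 409 codes for success and duplicate starts are assumptions for illustration.

```typescript
class RunRegistry {
  private runs: Record<string, string[]> = {};
  private maxPerAccount: number;

  constructor(maxPerAccount = 10) {
    this.maxPerAccount = maxPerAccount;
  }

  // Returns an HTTP-style status code for the start request.
  start(accountId: string, candidateId: string): number {
    const active = this.runs[accountId] || (this.runs[accountId] = []);
    if (active.indexOf(candidateId) !== -1) return 409; // one instance per candidate (assumed code)
    if (active.length >= this.maxPerAccount) return 429; // per-account cap reached
    active.push(candidateId);
    return 201;
  }

  stop(accountId: string, candidateId: string): void {
    const active = this.runs[accountId] || [];
    this.runs[accountId] = active.filter((id) => id !== candidateId);
  }
}

const registry = new RunRegistry();
const statuses: number[] = [];
for (let i = 0; i < 11; i++) {
  statuses.push(registry.start("acct-1", "candidate-" + i));
}
console.log(statuses[9], statuses[10]); // 201 429
```

Stopping a run frees a slot, so the cap bounds concurrent runs rather than lifetime starts.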

3. With no human, the default has to invert.
Fail-open is a default that assumes a backstop. Letting risk checks fail open in dev is fine when a human is at the screen.

The unattended runner is not at a screen. A risk engine that’s disabled or fails to load becomes an autonomous order loop with zero risk checks — the worst possible default. So on this path the default inverts: fail closed. No risk guard, no run, unless you explicitly opt out.
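In code, inverting the default looks something like the sketch below: if the guard cannot be loaded, the run refuses to start unless the caller has explicitly opted out. `RiskGuard`, `resolveRiskGuard`, and the `allowNoRiskGuard` flag are hypothetical names, not Inalpha's.

```typescript
interface RiskGuard { check(orderQty: number): boolean }

interface StartOptions { allowNoRiskGuard?: boolean }

// Fail closed: a missing or broken guard blocks the run by default.
function resolveRiskGuard(loadGuard: () => RiskGuard, options: StartOptions = {}): RiskGuard {
  try {
    return loadGuard();
  } catch (err) {
    if (options.allowNoRiskGuard === true) {
      // Explicit opt-out: run with a guard that approves everything.
      return { check: () => true };
    }
    throw new Error("risk guard unavailable; refusing to start run");
  }
}

const workingLoader = (): RiskGuard => ({ check: (qty) => qty <= 100 });
const brokenLoader = (): RiskGuard => { throw new Error("risk engine failed to load"); };

console.log(resolveRiskGuard(workingLoader).check(500)); // false

let refused = false;
try {
  resolveRiskGuard(brokenLoader);
} catch (err) {
  refused = true;
}
console.log(refused); // true
```

The opt-out exists, but it has to be spelled out at the call site, so "no risk checks" can never be the silent outcome of a load failure.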

4. Backtest/live parity has to reach down to data shape.
A human wouldn’t trade a half-formed bar. The machine will, unless the architecture forbids it.

The latest bar each tick is often still forming — its close isn’t final. Acting on it silently diverges from the backtest, which only ever saw closed bars. So the runner decides only on closed bars, matching backtest semantics exactly.
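A sketch of the closed-bar rule, assuming each bar carries its open time and duration in milliseconds (the field names are made up): the runner picks the newest bar whose interval has fully elapsed and ignores the one still forming.

```typescript
interface Bar { openTime: number; durationMs: number; close: number }

// A bar is closed once its full interval has elapsed.
function isClosed(bar: Bar, nowMs: number): boolean {
  return bar.openTime + bar.durationMs <= nowMs;
}

// Walk backwards from the newest bar and return the first closed one.
function latestClosedBar(bars: Bar[], nowMs: number): Bar | undefined {
  for (let i = bars.length - 1; i >= 0; i--) {
    const bar = bars[i];
    if (bar && isClosed(bar, nowMs)) return bar;
  }
  return undefined;
}

const minute = 60000;
const feed: Bar[] = [
  { openTime: 0, durationMs: minute, close: 100 },
  { openTime: minute, durationMs: minute, close: 104 }, // still forming at t = 90s
];

console.log(latestClosedBar(feed, 90000)?.close);  // 100
console.log(latestClosedBar(feed, 120000)?.close); // 104
```

At 90 seconds the second bar's close of 104 is provisional, so the decision is made on 100, which is the value a backtest over the same data would have seen.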

(One implementation detail rides along. The loop treated every exception as retryable with backoff, so a deterministic failure, such as a delisted symbol or a constraint violation, burned the whole retry budget before giving up. It now splits retryable from non-retryable and stops immediately on the latter. Plumbing, not architecture.)
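The retry split can be sketched like this. It is kept synchronous for brevity, with the backoff delay handed to a callback instead of actually sleeping; the `retryable` flag on the error, the helper names, and the 100 ms starting delay are assumptions for illustration.

```typescript
// Errors are retryable unless explicitly marked otherwise.
function isRetryable(err: unknown): boolean {
  return !(typeof err === "object" && err !== null && (err as { retryable?: boolean }).retryable === false);
}

function nonRetryable(message: string): Error {
  return Object.assign(new Error(message), { retryable: false });
}

// Retries with doubling backoff, but gives up at once on non-retryable errors.
function runWithRetry<T>(fn: () => T, maxAttempts: number, onBackoff: (ms: number) => void = () => {}): T {
  let delayMs = 100;
  for (let attempt = 1; ; attempt++) {
    try {
      return fn();
    } catch (err) {
      if (!isRetryable(err) || attempt >= maxAttempts) throw err;
      onBackoff(delayMs); // a real runner would await a sleep here
      delayMs *= 2;
    }
  }
}

// Transient failure: succeeds on the third attempt.
let calls = 0;
const delays: number[] = [];
const result = runWithRetry(() => {
  calls++;
  if (calls < 3) throw new Error("data service timeout");
  return "ok";
}, 5, (ms) => delays.push(ms));
console.log(result, calls, delays); // ok 3 [ 100, 200 ]

// Deterministic failure: one attempt, no backoff.
let delistedCalls = 0;
try {
  runWithRetry(() => {
    delistedCalls++;
    throw nonRetryable("symbol delisted");
  }, 5);
} catch (err) {
  console.log(delistedCalls); // 1
}
```

Defaulting unknown errors to retryable keeps the old behavior for transient faults, while anything tagged non-retryable exits on the first attempt.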

Some of these I saw clearly only after an adversarial review of the shipped runner. But they aren’t scattered bugs. They’re four corollaries of one sentence: the trust boundary of an autonomous path is not the same boundary as one with a human in the loop.

What this still costs, and what we punted

Honesty section, same as last time.

  • The runner runs candidate code in the main event loop. The backtest path isolates strategy code in a resource-limited subprocess. The live session compiles and runs it inline. The AST audit is a static gate, not a runtime one — it won’t stop a pure-compute infinite loop from hanging the service. We lean on the two human gates to keep the code trusted. Subprocess/watchdog hardening is filed, not done.
  • A crash mid-fill can drift the in-memory position from the DB. The fill is committed to the DB first, then replayed into the session. If the process dies in between, a restart rebuilds the session from empty cash, not from the DB positions. “Resume a run from its real position” is the next robustness item, deliberately not faked in this release.
  • Single-instance only. Startup reconciliation marks every stranded running row as errored. Correct for one process, wrong the moment you run two. Multi-instance leasing is a Phase-F item, flagged in the code where it bites.

I’d rather ship the gates that are load-bearing now and name the ones that aren’t yet, than imply the autonomous path is hardened against things it isn’t.

So what

The cheap version of “let the agent trade for you” deletes the approval step and calls it autonomy.

The audit-grade version keeps the entire order path intact. It stamps the approver as the machine. It earns that stamp with two human gates the model can’t route around. Then it redesigns the trust boundary, so every guarantee the human used to backstop is one the system now holds on its own.

Autonomy isn’t the absence of the harness. It’s the harness running without you in the chair.

If this resonated:

  • 📬 Subscribe to Inalpha on Substack — one long-form post a month, ADRs and post-mortems, no algorithm between us and you
  • github.com/mirror29/inalpha — the live runner, the plan/exec path, and the four boundary changes above are all in services/paper
  • 👉 Next post: Sandboxed strategy evolution — three gates + multi-objective fitness. What happens when you actually let the LLM mutate trading code, and what catches it when it shouldn’t have. (Yes, the one I promised last time — it’s next.)

What an OpenAI-Compatible API Router Should Actually Do

An OpenAI-compatible API router should not make your stack more complicated. If it does, it has already failed.

The whole point of compatibility is boring simplicity:

One base URL.

One API key.

Same general SDK shape.

That gives you room to improve the economics without rewriting the application.

For AI coding workflows, this matters because the tool in front is often already good enough. The pain is underneath: cost, provider management, usage logs, and routing.

The minimum useful setup should look familiar:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://incat.ai/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

If a router requires a large rewrite before you can test it, most developers will not bother. They are right.

The first test should be small:

  • one workflow
  • one API key
  • one prepaid balance
  • one cost comparison

What should the router do?

Route by task

Send routine work to cheaper capable models. Keep risky work on stronger models.

Preserve logs

Developers need to know which workflow burns money.

Avoid surprise bills

Prepaid credits are useful because they turn runaway usage into a visible constraint.

Keep escape hatches

If a cheaper route is not good enough, switch back. Routing should create options, not lock-in.
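To make "route by task" and "keep escape hatches" concrete, here is a minimal sketch of the idea. It is not inCat's implementation: the task kinds, the tier table, and the model identifiers are placeholders invented for the example.

```typescript
// Hypothetical sketch: task kinds, tiers, and model ids are placeholders.
type TaskKind = "autocomplete" | "commit-message" | "refactor" | "security-review";
type Tier = "cheap" | "strong";

// Routine work goes to the cheap tier, risky work stays on the strong one.
const TIER_BY_TASK: Record<TaskKind, Tier> = {
  "autocomplete": "cheap",
  "commit-message": "cheap",
  "refactor": "strong",
  "security-review": "strong",
};

const MODEL_BY_TIER: Record<Tier, string> = {
  cheap: "example-small-model",
  strong: "example-large-model",
};

// `forceStrong` is the escape hatch: one flag switches a workflow back
// when the cheaper route turns out not to be good enough.
function pickModel(task: TaskKind, forceStrong: boolean = false): string {
  const tier: Tier = forceStrong ? "strong" : TIER_BY_TASK[task];
  return MODEL_BY_TIER[tier];
}
```

Because the application only ever asks for a model by task, switching routes is a table edit, not a rewrite.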

That is the category I want inCat to live in.

Not another AI coding app.

Not a model museum.

An OpenAI-compatible API router for developers who want the same workflow to cost less.

Generate a config:

https://incat.ai/codex-config-generator.html

Finishing a Read-Only MCP Server: From 6 Tools to 9

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

I took an unfinished open-source MCP server for DEV.to and added the missing half.

The original repo (nickytonline/dev-to-mcp) was built by an actual DEV.to engineer and shipped six read-only tools: get_articles, get_article, get_user, get_tags, get_comments, search_articles. Useful for reading, useless for writing.

I extended it with three write tools:

  • create_article for publishing new articles (draft or live)
  • update_article for editing existing ones
  • delete_article for unpublishing

The result is a full read-write MCP server that lets Claude (or any MCP client) treat DEV.to like a CMS. This article was created and published using it.

Demo

The tool list in Claude Desktop after the build:

Read-only tools (6):
  Get Articles, Get Article, Get User, Get Tags, Get Comments, Search Articles

Write/delete tools (3):
  Create Article, Update Article, Delete Article

A draft creation call looks like this:

{
  "tool": "create_article",
  "args": {
    "title": "My new post",
    "body_markdown": "# Hello world",
    "tags": ["webdev", "ai"],
    "published": false
  }
}

The MCP server hits POST https://dev.to/api/articles with the user’s DEVTO_API_KEY from env, returns the article ID, and Claude can immediately call update_article against it. No browser, no copy-paste from chat to editor.

The Journey

The original repo was solid but limited. I asked myself: why use an MCP server that can only read?

Setup was the first wall. The npm package wasn’t published, so npx -y @nickytonline/dev-to-mcp returned 404. Then npm install -g github:... failed because the repo had no top-level package.json at the install path npm expected. The fix was unglamorous: git clone, npm install, npm run build, point Claude Desktop’s config at the local dist/index.js.

There was also a Windows-specific gotcha. Claude Desktop on Windows needs npx.cmd, not npx. The error message was just Server disconnected. Logs showed bad option: -y because the config still had the npx flag while the command had been swapped to node. Small things, two hours.

Once the read-only server was running, the actual finish-up work was straightforward. The codebase used a clean handler pattern: each tool was a function that called the DEV.to API and returned a typed response. I followed the same pattern for the three new tools:

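// Payload shape, matching the fields in the draft creation call above
interface ArticleInput {
  title: string;
  body_markdown: string;
  tags?: string[];
  published?: boolean;
}

// Pattern from the existing read tools
async function getArticle(id: number) {
  const res = await fetch(`https://dev.to/api/articles/${id}`, {
    // Fall back to '' so the header value is always a string
    headers: { 'api-key': process.env.DEVTO_API_KEY ?? '' }
  });
  if (!res.ok) throw new Error(`DEV.to API returned ${res.status}`);
  return res.json();
}

// New write tool, same shape
async function createArticle(article: ArticleInput) {
  const res = await fetch('https://dev.to/api/articles', {
    method: 'POST',
    headers: {
      'api-key': process.env.DEVTO_API_KEY ?? '',
      'Content-Type': 'application/json'
    },
    // DEV.to expects the payload wrapped in an "article" object
    body: JSON.stringify({ article })
  });
  if (!res.ok) throw new Error(`DEV.to API returned ${res.status}`);
  return res.json();
}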

Register the new handlers in the MCP server’s tool list, rebuild with npm run build, restart Claude Desktop. Done.
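For a sense of how the second write tool follows the same shape, here is a sketch of an update_article request builder. It is an illustration, not the fork's actual code: `buildUpdateArticleRequest`, `ArticlePatch`, and `UpdateRequest` are hypothetical names, and the sketch assumes the DEV.to API's PUT /api/articles/{id} endpoint with the same "article" wrapper used for creation.

```typescript
// Hypothetical sketch: helper and type names are assumptions, not the fork's code.
interface ArticlePatch {
  title?: string;
  body_markdown?: string;
  tags?: string[];
  published?: boolean;
}

interface UpdateRequest {
  url: string;
  method: "PUT";
  headers: Record<string, string>;
  body: string;
}

// Build the request without sending it, so the shape can be checked offline.
function buildUpdateArticleRequest(id: number, patch: ArticlePatch, apiKey: string): UpdateRequest {
  if (!Number.isInteger(id) || id <= 0) throw new Error("article id must be a positive integer");
  if (apiKey.length === 0) throw new Error("API key is required");
  return {
    url: `https://dev.to/api/articles/${id}`,
    method: "PUT",
    headers: { "api-key": apiKey, "Content-Type": "application/json" },
    // Only the fields present in `patch` are sent.
    body: JSON.stringify({ article: patch }),
  };
}
```

A handler would then pass the result to fetch, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder pure makes the draft-then-publish flow easy to test without touching the live API.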

Tech Stack

  • TypeScript for the server code
  • Vite for the build (12.83 kB output, builds in 133ms)
  • Model Context Protocol SDK for the server scaffolding
  • DEV.to API v1 as the backend
  • Claude Desktop as the MCP client

What I Learned

Two things stood out.

First, finishing someone else’s project is faster than starting from scratch. The repo had types, patterns, error handling, and tests already in place. Adding three tools meant matching an existing shape, not inventing one. The full add-rebuild-test cycle was under thirty minutes.

Second, AI assistance works best when there’s existing structure to imitate. I gave Claude Code the repo path and asked for three tools matching the existing pattern. It read the codebase, identified the handler signature, knew the DEV.to API endpoints from training data, and produced working code on the first build. Without the existing read-only tools as reference, the output would have needed more iteration.

What’s Next

The MCP server still has gaps:

  • No image upload (DEV.to requires base64 inline or external URLs)
  • No get_followers or get_following
  • No comment write/delete
  • No analytics endpoints

These are small additions following the same pattern. The hard part was the first three.

Repo

The forked repo with write tools: github.com/glatinone/dev-to-mcp

Original credit to @nickytonline for the read-only foundation.

How Excel is used in Real World Data Analysis

I’ve always known Excel as a tool for creating tables and performing simple calculations. However, after spending a week learning its fundamentals, I now understand why Excel remains one of the most widely used tools in data analysis.
Microsoft Excel is a spreadsheet application that allows users to collect, organize, clean, analyze, calculate, and visualize data. Its user-friendly interface and powerful features make it a valuable tool for individuals and organizations across different industries.
One way Excel is used in real-world data analysis is in business decision-making. Companies collect large amounts of data on sales, customers, and operations. Analysts use Excel to sort and filter this data, helping managers identify trends, monitor performance, and make informed decisions. For example, a retail business can sort products by sales volume to identify its best-selling items.
Excel is also widely used in financial reporting. Businesses use it to track expenses, calculate profits, prepare budgets, and generate financial reports. With formulas and formatting tools, financial data can be organized in a way that is easy to understand and analyze.
Another common application is marketing performance analysis. Marketing teams collect data from campaigns, websites, and social media platforms. Excel can be used to analyze campaign results, compare performance metrics, and identify which strategies are generating the best outcomes.
Throughout this week, I learned several Excel features and formulas that are useful in data analysis. The first is filtering, which allows analysts to display only the data that meets specific criteria. This is useful when working with large datasets and looking for particular information. I also learned about data validation, which helps maintain data quality by restricting the type of information users can enter into cells. This reduces errors and improves data accuracy.
In addition, I learned functions such as SUM(), AVERAGE(), and COUNT(). SUM() helps calculate totals, AVERAGE() finds the mean value of a dataset, and COUNT() determines how many numerical values exist within a range. These functions make it easier to summarize and understand data quickly. I also found text functions such as TRIM() and PROPER() useful for cleaning and standardizing data before analysis.
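To check my understanding of these functions, I find it helpful to spell out their behaviour in code. The sketch below uses TypeScript rather than Excel formulas, and the function names are my own; it is a simplified model (for example, `proper` only treats A to Z as letters), not a full reimplementation of Excel.

```typescript
// Simplified models of Excel functions; names and scope are illustrative only.
type Cell = number | string | boolean | null;

// COUNT: counts only the cells that contain numbers.
function count(cells: Cell[]): number {
  return cells.filter((c): c is number => typeof c === "number" && Number.isFinite(c)).length;
}

// SUM: adds the numeric cells and ignores text and blanks in a range.
function sum(cells: Cell[]): number {
  return cells.reduce<number>((total, c) => (typeof c === "number" ? total + c : total), 0);
}

// AVERAGE: mean of the numeric cells.
function average(cells: Cell[]): number {
  const n = count(cells);
  if (n === 0) throw new Error("no numeric cells to average");
  return sum(cells) / n;
}

// TRIM: removes leading/trailing spaces and collapses repeated spaces to one.
function trim(text: string): string {
  return text.split(" ").filter((word) => word.length > 0).join(" ");
}

// PROPER: capitalises the first letter of each word (and any letter that
// follows a non-letter) and lowercases the rest.
function proper(text: string): string {
  let previousWasLetter = false;
  let result = "";
  for (const ch of text) {
    const isLetter = /[A-Za-z]/.test(ch);
    result += isLetter ? (previousWasLetter ? ch.toLowerCase() : ch.toUpperCase()) : ch;
    previousWasLetter = isLetter;
  }
  return result;
}
```

For a column holding 10, "n/a", 20, a blank, and 30, `count` gives 3, `sum` gives 60, and `average` gives 20, which is why cleaning text entries out of numeric columns matters before summarizing.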
Learning Excel has changed the way I see data. Before, I saw data as a collection of numbers and text. Now, I see it as information that can tell a story and support decision-making when properly organized and analyzed. Excel has shown me that effective data analysis begins with understanding how to clean, structure, and explore data. As I continue my journey in data science, I can already see how these foundational Excel skills will support my learning of more advanced tools and techniques.

AI Agent Safety Needs Stop Signs, Not Just Instructions

AI agents do not only need better instructions.

They need stop signs.

That is one of the clearest reasons Ota exists as software execution governance for humans and AI agents. A repo should not merely tell an agent what it can try. It should declare what the agent must not do, when it must stop, and what requires human approval.

Prompts and AGENTS.md files are useful. They give agents context: how the project is organized, what style to follow, how to summarize changes, and which areas need caution.

But advice is not a boundary.

An instruction says:

Be careful with database commands.

A stop sign says:

Do not run destructive database commands unless explicitly approved.

An instruction says:

Avoid editing generated files.

A stop sign says:

These paths are protected. Stop if the requested edit falls outside the writable boundary.

That difference matters because modern agents are no longer passive readers. They inspect repos, choose commands, edit files, run checks, interpret failures, and report completion.

If the repo gives them only guidance, they still have to infer the boundary.

Ota’s position is sharper: agent execution should not depend on inference. It should be governed by the repo.

Instructions tell agents what to attempt

Most agent guidance is written as advice.

It says:

  • follow the existing style
  • prefer small changes
  • run tests before finishing
  • avoid touching generated files
  • do not expose secrets
  • explain what changed

That helps. It makes agents less generic and more aware of the repo they are working inside.

But it still leaves the dangerous questions open.

Which tests should the agent run?
Which commands are allowed?
Which files are generated?
Which services require approval?
Which failures mean “fix the code” and which mean “stop and ask”?
Which paths are out of bounds?

A capable agent may make reasonable guesses.

But reasonable guesses are not governance.

For low-risk editing, guidance may be enough. For repo execution, CI, automation, and agentic development, the repo needs something stronger.

Stop signs define when not to continue

A stop sign is not a suggestion.

It is a boundary.

In a repo, stopping rules should cover at least five areas.

1. Secrets and credentials

An agent should not invent secrets, request private values indirectly, or edit sensitive environment files just to make a task pass.

If a command needs an API key, database password, cloud token, or private credential, the correct behavior is not improvisation.

The correct behavior is to stop and report the blocker.

2. External services

Some tasks depend on systems outside the repo: cloud infrastructure, managed databases, payment providers, queues, object storage, or production-like services.

If those services are unavailable, the agent should not patch code around the failure.

It should identify the missing dependency and stop.

3. Unsafe mutation

Some commands change state.

deploy
publish
db:reset
terraform apply

These are not cousins of test, lint, or build.

If a task can mutate external state, delete data, publish packages, or affect infrastructure, the repo should not outsource that decision to the agent’s confidence.

That boundary should be declared.

4. Protected paths

Agents need to know where they can work.

Source files and tests may be open. Generated files, migrations, lockfiles, production config, and environment files may need review or approval.

This is not about slowing the agent down.

It is about preventing quiet damage in files that carry operational weight.

5. Verification limits

Agents also need to know when verification is finite.

A long-running dev server is not a verification result.
A watch mode is not a handoff signal.
A task that never terminates is not the same as a bounded check.

Agent-safe tasks need finite verification paths: run, finish, report status.

Without that, the agent may wait indefinitely, stop too early, or report success without a meaningful result.

This is execution governance

This is bigger than prompt quality.

If an agent runs a risky command, edits a protected file, or treats missing credentials as a code problem, the issue is not only that the agent made a poor choice.

The repo failed to govern execution.

Software execution governance means the repo can declare:

  • what it needs
  • how it should be prepared
  • what can be executed
  • what requires approval
  • where agents can write
  • when verification is complete
  • when execution must stop

That is the frame Ota is built around.

Not “better setup docs.”

Not “another task runner.”

Ota is the contract-first way to make execution boundaries explicit for humans, CI, automation, and AI agents.

How Ota makes stop signs explicit

In an Ota-backed repo, stopping rules do not have to live only in prose.

The contract can declare safe tasks, verification tasks, writable paths, protected paths, setup requirements, and readiness blockers.

That gives agents a governed operating model:

If the task is declared safe, proceed.
If setup is required, prepare from the contract.
If the contract is invalid, stop.
If secrets or credentials are missing, stop.
If the requested edit is outside writable paths, stop.
If the task mutates external state without approval, stop.
If verification is complete, report the result.

That is stronger than telling an agent to “be careful.”
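The operating model above can be read as a small decision function. The sketch below is illustrative only: the `Situation` fields and the `decide` function are hypothetical and are not Ota's contract schema or API. It also assumes two things the list does not state: stop conditions are checked before anything else, and a task that is not declared safe is refused.

```typescript
// Hypothetical sketch: field names and ordering are assumptions, not Ota's schema.
interface Situation {
  contractValid: boolean;
  taskDeclaredSafe: boolean;
  setupRequired: boolean;
  secretsMissing: boolean;
  editOutsideWritablePaths: boolean;
  mutatesExternalState: boolean;
  approved: boolean;
  verificationComplete: boolean;
}

type Decision = "stop" | "prepare" | "proceed" | "report";

function decide(s: Situation): { decision: Decision; reason: string } {
  // Stop signs first: none of these are the agent's call to make.
  if (!s.contractValid) return { decision: "stop", reason: "contract is invalid" };
  if (s.secretsMissing) return { decision: "stop", reason: "secrets or credentials are missing" };
  if (s.editOutsideWritablePaths) return { decision: "stop", reason: "edit is outside writable paths" };
  if (s.mutatesExternalState && !s.approved) {
    return { decision: "stop", reason: "external mutation without approval" };
  }
  // Then the governed happy path.
  if (s.verificationComplete) return { decision: "report", reason: "verification is complete" };
  if (s.setupRequired) return { decision: "prepare", reason: "prepare from the contract" };
  if (s.taskDeclaredSafe) return { decision: "proceed", reason: "task is declared safe" };
  // Assumption: anything undeclared is refused instead of guessed at.
  return { decision: "stop", reason: "task is not declared safe" };
}
```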

Ota’s agent quickstart follows this same principle: agents should prefer repo-local contracts when they exist, execute declared safe tasks, parse JSON output instead of scraping terminal prose, and stop when blockers involve secrets, credentials, external services, unsafe mutation, or paths outside declared boundaries.

The command surface supports that model:

  • ota doctor checks readiness and surfaces blockers before work begins.
  • ota validate checks whether the contract itself is usable.
  • ota tasks shows what work the repo has declared.
  • ota up --dry-run previews setup before changing the environment.
  • ota run <task> --json runs declared work and returns stable status for automation.

The point is not that every agent action needs ceremony.

The point is that dangerous ambiguity should be removed before execution happens.

AGENTS.md still matters

This does not make AGENTS.md useless.

It means AGENTS.md should do what prose does best: explain context.

Use it for style, conventions, architectural notes, review expectations, and collaboration preferences.

Use Ota for the execution boundary.

A clean split looks like this:

AGENTS.md:
How the agent should behave.

ota.yaml:
What the repo allows, requires, verifies, and refuses.

One gives the agent context.

The other governs the repo.

Together, they produce a better operator: one that understands the project and knows where the guardrails are.

Stop signs build trust

Teams do not trust agents because agents sound confident.

They trust agents when the repo constrains what the agent can do, makes the approved path obvious, and produces evidence for what happened.

A good stop sign does not make agents less useful.

It makes them dependable.

It tells the agent:

Move quickly here.
Slow down here.
Stop here.
Ask here.
Report this.
Do not guess.

That is the behavior serious teams need as AI agents move from code suggestion into repo execution.

Conclusion

AI agents need instructions.

But instructions alone are not enough.

A repo that only tells agents what to do still leaves too much room for unsafe interpretation. The next layer is stopping rules: clear boundaries for secrets, external services, unsafe mutation, protected paths, and finite verification.

That is why Ota’s contract-first model matters.

It turns agent safety from advice into execution governance.

The future of AI-assisted development will not be won by repos that merely prompt agents better.

It will be won by repos that know when agents should stop.

  • Explore the Ota getting started guide
  • Check out the Ota examples repo

Originally posted @ ota.run

How the Internet Actually Works – Networking, DNS, Architecture & My DMI DevOps Journey

Week 0: How the Internet Actually Works – Networking, DNS, Architecture & My DevOps Journey Begins

I recently joined the DevOps Micro Internship (DMI) – Cohort 3, a free, project-based program by Pravin Mishra at CloudAdvisory. Before we dive into the exciting parts – containers, CI/CD pipelines, Kubernetes, cloud platforms – the program correctly insists on mastering the foundational concepts first.

This post documents everything I worked through in Week 0, covering five core tasks and my honest reflections. If you are starting your DevOps journey, this post is for you too. Consider this a beginner-friendly technical reference, not just a journal entry.

Why Foundations Matter in DevOps

It is tempting to jump straight into Docker or AWS. I get it – the tools look cool, the job postings mention them everywhere, and YouTube tutorials make them seem approachable. But here is the uncomfortable truth: tools break, documentation changes, and architectures evolve. What does not change nearly as fast are the underlying fundamentals.

A DevOps engineer who understands how data actually travels across a network, why DNS exists, and how application layers are separated will debug production incidents faster, design more resilient systems, and adapt to new tools with far less friction. That is the mindset behind Week 0.

Let’s get into it.

Task 1 – Exploring Concepts with AI: Networking Protocols

The first task involved using ChatGPT to explore networking protocols from first principles. The goal was not just to get an answer, but to learn how to ask precise questions and synthesise the response into a genuine understanding.

What Are Networking Protocols?

A networking protocol is a standardised set of rules that governs how data is transmitted between devices on a network. Without protocols, two devices attempting to communicate would be like two people trying to have a conversation, one speaking English and the other French, with no shared framework.

Protocols define:

  • Format: What does a valid message look like?
  • Sequencing: Who speaks first? Who speaks next?
  • Error handling: What happens when something goes wrong?
  • Termination: How does the conversation end cleanly?

Think of it like road traffic laws. The laws do not build the roads, but they ensure that everyone using the roads does so in a predictable, safe, and efficient manner. Without them, even a perfectly built road would result in chaos.

Key Insight from This Task

What struck me most was how protocols operate in layers. No single protocol handles everything. Instead, a stack of protocols each handles a specific concern, and together they make the internet function. This layered thinking – breaking a complex problem into isolated, composable responsibilities – is also a core principle in software architecture and DevOps. I would encounter it again and again as the week progressed.

Task 2 – Internet & Networking Fundamentals

This task required me to explain four foundational concepts in my own words. Here is my in-depth take on each.

Packet Switching

When you send a message, a file, or a video stream across the internet, that data is not sent as one giant, continuous stream. Instead, it is broken into small chunks called packets. Each packet contains a piece of the actual data (the payload), plus metadata – the source address, destination address, sequence number, and error-checking information.

These packets do not all travel the same route. Routers across the internet evaluate network conditions in real time and forward each packet along the most efficient path available at that moment. At the destination, the packets are reassembled in the correct order.
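To make the split-and-reassemble idea concrete for myself, I sketched it in TypeScript. This is a toy model, not how a real network stack is written: the `Packet` shape and both function names are my own, and real packets also carry addresses and checksums.

```typescript
// Toy model of packet switching; types and names are illustrative only.
interface Packet {
  seq: number;     // position of this chunk in the original message
  total: number;   // how many packets make up the message
  payload: string;
}

// Break a message into fixed-size chunks, each tagged with its sequence number.
function packetize(message: string, size: number): Packet[] {
  if (!Number.isInteger(size) || size <= 0) throw new Error("size must be a positive integer");
  const total = Math.ceil(message.length / size);
  const packets: Packet[] = [];
  for (let seq = 0; seq < total; seq++) {
    packets.push({ seq, total, payload: message.slice(seq * size, (seq + 1) * size) });
  }
  return packets;
}

// Put packets back in order, whatever order they arrived in.
function reassemble(packets: Packet[]): string {
  const sorted = [...packets].sort((a, b) => a.seq - b.seq);
  const first = sorted[0];
  if (first === undefined) return "";
  const complete = sorted.length === first.total && sorted.every((p, i) => p.seq === i);
  if (!complete) throw new Error("missing or duplicate packets: retransmission needed");
  return sorted.map((p) => p.payload).join("");
}
```

Reversing or shuffling the output of `packetize` before calling `reassemble` still returns the original message, which is the whole trick: order is restored at the destination, not guaranteed by the route.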

Why does this matter? Packet switching is what makes the internet resilient. If one network link fails, packets are simply rerouted. No single point of failure can take down the entire communication. This is a fundamentally different (and superior) model to the old circuit-switched telephone network, where a dedicated line had to remain open for the entire duration of a call.

The DevOps connection: When you are debugging network latency or packet loss in a distributed system, understanding packet switching tells you why packets arrive out of order, why retransmission happens, and where to look when something is slow.

IP Address

An IP (Internet Protocol) address is a numerical label assigned to every device on a network. It serves two core purposes: host identification (which device is this?) and location addressing (where is this device on the network?).

There are two versions currently in use:

| Version | Format | Example | Address Space |
|---------|--------|---------|---------------|
| IPv4 | 32-bit, four octets | 192.168.1.1 | ~4.3 billion addresses |
| IPv6 | 128-bit, eight groups | 2001:0db8:85a3::8a2e:0370:7334 | ~340 undecillion addresses |

The world ran out of IPv4 addresses years ago. Techniques like NAT (Network Address Translation) have extended IPv4’s lifespan by allowing multiple devices on a private network to share a single public IP, but IPv6 adoption is the long-term solution.
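A small TypeScript sketch helped me see why an IPv4 address is "32-bit" and how private ranges are checked. The function names are my own; the three private blocks are the standard RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).

```typescript
// Illustrative helpers; function names are my own.

// Convert dotted-quad notation into the underlying 32-bit number.
function ipv4ToInt(addr: string): number {
  const parts = addr.split(".");
  if (parts.length !== 4) throw new Error(`not an IPv4 address: ${addr}`);
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) throw new Error(`bad octet: ${part}`);
    const octet = Number(part);
    if (octet > 255) throw new Error(`octet out of range: ${part}`);
    value = value * 256 + octet; // shift left by 8 bits, then add the octet
  }
  return value;
}

// True if `addr` falls inside the CIDR block, e.g. "10.0.0.0/8".
function inCidr(addr: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  if (base === undefined || bitsText === undefined) throw new Error(`bad CIDR: ${cidr}`);
  const blockSize = 2 ** (32 - Number(bitsText));
  return Math.floor(ipv4ToInt(addr) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize);
}

// Private (RFC 1918) addresses are the ones NAT hides behind a public IP.
function isPrivateIpv4(addr: string): boolean {
  return ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"].some((cidr) => inCidr(addr, cidr));
}
```

For example, `isPrivateIpv4("192.168.1.1")` is true, while a public address such as 52.172.142.222 is not in any of the three blocks.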

The DevOps connection: You will work with IP addresses constantly – assigning them to servers, configuring security group rules, setting up load balancers, and troubleshooting connectivity. Understanding the difference between public and private IPs, and how subnetting works, is essential for cloud networking on AWS, GCP, or Azure.

TCP/IP

TCP/IP is not one protocol but a suite of protocols. The two most important are:

IP (Internet Protocol) – handles addressing and routing. It is responsible for getting packets from a source to a destination, but it is connectionless and does not guarantee delivery or order.

TCP (Transmission Control Protocol) – adds reliability on top of IP. Before any data is sent, TCP performs a three-way handshake:

  1. SYN: The client sends a synchronise packet to the server.
  2. SYN-ACK: The server acknowledges and sends its own synchronise.
  3. ACK: The client acknowledges the server’s response.

A connection is now established. TCP then ensures every packet is received, requests retransmission of any lost packets, and delivers data to the application layer in the correct order.
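The handshake is easier to follow with the sequence numbers written out. The TypeScript sketch below is a simplified model I put together (the `Segment` type and function name are my own, and it ignores 32-bit wraparound, options, and timeouts): each side picks an initial sequence number, and each acknowledgement is the other side's number plus one.

```typescript
// Simplified model of the TCP three-way handshake; names are illustrative.
interface Segment {
  from: "client" | "server";
  flags: Array<"SYN" | "ACK">;
  seq: number;
  ack?: number;
}

function threeWayHandshake(clientIsn: number, serverIsn: number): Segment[] {
  // 1. SYN: the client announces its initial sequence number.
  const syn: Segment = { from: "client", flags: ["SYN"], seq: clientIsn };
  // 2. SYN-ACK: the server acknowledges clientIsn + 1 and announces its own.
  const synAck: Segment = { from: "server", flags: ["SYN", "ACK"], seq: serverIsn, ack: syn.seq + 1 };
  // 3. ACK: the client acknowledges serverIsn + 1. The SYN consumed one
  //    sequence number, so the client's own seq is now clientIsn + 1.
  const ack: Segment = { from: "client", flags: ["ACK"], seq: syn.seq + 1, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}
```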

UDP (User Datagram Protocol) is the alternative – connectionless, no handshake, no guaranteed delivery. It is faster, which makes it ideal for video streaming, gaming, and DNS lookups where a dropped packet is less catastrophic than a delay.

The DevOps connection: When you configure a load balancer, you choose between TCP and HTTP (which runs on top of TCP). When you write a Dockerfile exposing a port, you specify TCP or UDP. Understanding this layer is the difference between configuring things by guessing and configuring them with confidence.

HTTP and HTTPS

HTTP (HyperText Transfer Protocol) is the application-layer protocol used to transfer web pages, APIs, and other resources over the internet. It operates on a simple request-response model:

  1. A client (browser, API consumer, CLI tool) sends an HTTP request with a method (GET, POST, PUT, DELETE), headers, and optionally a body.
  2. A server returns an HTTP response with a status code, headers, and optionally a body.

HTTPS (HTTP Secure) wraps HTTP inside TLS (Transport Layer Security), which provides:

  • Encryption: Data in transit cannot be read by third parties; anyone eavesdropping on the connection sees only ciphertext.
  • Authentication: The server’s identity is verified via a certificate signed by a trusted Certificate Authority (CA).
  • Integrity: Data cannot be tampered with in transit without detection.

The analogy I find most intuitive: HTTP is like sending a postcard – anyone handling it can read what it says. HTTPS is like sending a letter in a tamper-proof, locked box. Only the intended recipient has the key.

The DevOps connection: You will configure TLS certificates using tools like Let’s Encrypt and Cert-Manager. You will set up HTTPS on Nginx or a cloud load balancer. You will debug SSL handshake failures and certificate expiry alerts. Knowing what HTTPS actually does – not just that it “adds a padlock” – makes all of this manageable.

Task 3 – Application Architecture: Two-Tier vs. Three-Tier

Modern applications are not monolithic blobs of code. They are organised into architectural tiers – logical layers that separate concerns, enable independent scaling, and support team-based development. Understanding these tiers is critical for anyone working in DevOps, because you need to know what you are deploying, where each component lives, and how the layers communicate.

Two-Tier Architecture

In a two-tier (client-server) architecture, the application is split into exactly two layers:

┌─────────────────────┐
│    CLIENT TIER      │  ← Presentation + Business Logic
│ (Browser / Desktop) │
└──────────┬──────────┘
           │ Direct DB queries
           ▼
┌─────────────────────┐
│   DATABASE TIER     │  ← Data Storage
│ (MySQL / PostgreSQL)│
└─────────────────────┘

When it works well: Small internal tools, desktop applications with a limited number of users, and rapid prototyping. The simplicity means less infrastructure to manage.

Where it breaks down: The client handles both the UI and business logic. This means every client must be updated when business rules change. It also means clients often have direct database access, which is a serious security concern at scale.

Technologies typically involved:

| Tier | Examples |
|------|----------|
| Client | HTML/CSS, React, Angular, Desktop apps |
| Database | MySQL, PostgreSQL, SQLite |

Three-Tier Architecture

Three-tier architecture introduces a dedicated middle layer – the application server (or backend) – between the client and the database.

┌─────────────────────┐
│   PRESENTATION TIER │  ← UI only
│  (Browser / Mobile) │
└──────────┬──────────┘
           │ HTTP/HTTPS requests
           ▼
┌─────────────────────┐
│   APPLICATION TIER  │  ← Business Logic & APIs
│ (Node.js / Django)  │
└──────────┬──────────┘
           │ Parameterised queries
           ▼
┌─────────────────────┐
│     DATA TIER       │  ← Persistent Storage
│ (PostgreSQL/MongoDB)│
└─────────────────────┘

Why this matters:

  • Security: No client ever touches the database directly. The backend validates and sanitises all input before any query is executed.
  • Scalability: Each tier can be scaled independently. If your API is the bottleneck, you spin up more backend instances without touching the frontend or the database.
  • Maintainability: Business logic lives in one place. Change a rule in the backend, and all clients – web, mobile, CLI – immediately reflect that change.
  • Team autonomy: Frontend engineers, backend engineers, and DBAs can work in parallel without constantly stepping on each other.

Technologies typically involved:

| Tier | Examples |
|------|----------|
| Frontend | HTML, CSS, JavaScript, React, Angular, Vue |
| Backend | Node.js, Express.js, Django, Spring Boot, FastAPI |
| Database | MySQL, PostgreSQL, MongoDB, Redis |

The DevOps connection: When you write a Kubernetes deployment, you are typically deploying each tier as a separate service with its own pods, resource limits, health checks, and scaling policies. When you design a CI/CD pipeline, you often have separate pipelines for the frontend and backend. When you configure a database, you write network policies that allow only the backend service to connect. Three-tier thinking is baked into modern infrastructure.

Task 4 – Domain Name System (DNS) Deep Dive

DNS is one of those technologies that most people take for granted – until it breaks. When DNS goes down, the internet, from a user’s perspective, ceases to work. Understanding how it works is not optional for a DevOps engineer.

What is DNS?

DNS stands for Domain Name System. Its primary job is to translate human-readable domain names (like epicreads.com) into machine-readable IP addresses (like 52.172.142.222).

Without DNS, you would need to memorise the IP address of every website you want to visit. DNS is the phonebook of the internet.

How DNS Resolution Works (Step by Step)

When you type epicreads.com into your browser and hit Enter, here is what actually happens:

Browser → OS Cache → Recursive Resolver → Root Nameserver
       → TLD Nameserver (.com) → Authoritative Nameserver
       → Returns IP → Browser connects to 52.172.142.222

  1. Browser cache: The browser checks its own cache. Did it look up this domain recently?
  2. OS cache: If not, the operating system checks its static hosts file (/etc/hosts on Linux) and then its own DNS cache (on Windows, the DNS Client service).
  3. Recursive resolver: If still not found, the query goes to your ISP’s (or a public) recursive resolver, such as 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare). This resolver does the heavy lifting on your behalf.
  4. Root nameservers: The resolver asks a root nameserver. There are 13 root server identities (a.root-servers.net through m.root-servers.net), each served from many locations via anycast. They do not know the IP of epicreads.com, but they know who is authoritative for .com domains.
  5. TLD nameservers: The .com nameserver knows which nameserver is authoritative for epicreads.com.
  6. Authoritative nameserver: This is the nameserver managed by the domain’s owner (e.g., via AWS Route 53 or Cloudflare). It returns the definitive answer: the IP address associated with epicreads.com.
  7. Response travels back: The IP is cached at multiple levels (with a TTL – Time to Live – that controls how long it stays cached) and returned to the browser.
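
The steps above can be sketched as a toy simulation in Python. Nothing here touches the network: the ROOT, TLD, and AUTHORITATIVE tables, the server labels, and the resolve function are hypothetical stand-ins for real nameservers, and the cache is a plain dictionary with expiry times.

```python
import time

# Hypothetical in-memory stand-ins for the real nameserver hierarchy.
ROOT = {"com": "tld-com"}  # the root knows who serves .com
TLD = {"tld-com": {"epicreads.com": "auth-epicreads"}}  # .com knows the authoritative server
AUTHORITATIVE = {"auth-epicreads": {"epicreads.com": ("52.172.142.222", 300)}}  # (IP, TTL)

cache = {}  # domain -> (ip, expiry timestamp)


def resolve(domain, now=None):
    """Return (ip, source), where source says who answered the query."""
    now = time.time() if now is None else now

    # Steps 1-2: check the cache first and honour the TTL.
    if domain in cache:
        ip, expires_at = cache[domain]
        if now < expires_at:
            return ip, "cache"
        del cache[domain]  # the entry has expired

    # Step 4: the root points to the TLD server for the last label.
    tld_server = ROOT[domain.rsplit(".", 1)[-1]]
    # Step 5: the TLD server points to the authoritative server.
    auth_server = TLD[tld_server][domain]
    # Step 6: the authoritative server returns the answer and its TTL.
    ip, ttl = AUTHORITATIVE[auth_server][domain]

    # Step 7: cache the answer until the TTL runs out.
    cache[domain] = (ip, now + ttl)
    return ip, "authoritative"
```

Calling resolve("epicreads.com", now=0) walks the full hierarchy, a second call at now=100 is answered from the cache, and a call at now=400 walks the hierarchy again because the 300-second TTL has expired.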

DNS Record Types

Record Type | Purpose | Example
A | Maps a domain to an IPv4 address | epicreads.com → 52.172.142.222
AAAA | Maps a domain to an IPv6 address | epicreads.com → 2001:db8::1
CNAME | Alias – maps one domain to another | www.epicreads.com → epicreads.com
MX | Mail exchange – specifies mail servers | epicreads.com → aspmx.l.google.com
TXT | Arbitrary text – used for SPF, DKIM, domain verification | v=spf1 include:_spf.google.com ~all
NS | Nameserver – delegates a domain to specific DNS servers | epicreads.com → ns1.cloudflare.com
SOA | Start of Authority – metadata about the DNS zone | Includes serial number, refresh intervals
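
To see how record types interact, here is a minimal Python sketch of a lookup over a small record set. The RECORDS list and the lookup function are hypothetical and only illustrate the idea that a query for one type can be answered by following a CNAME alias to another name; a real resolver does far more.

```python
# Hypothetical record set using the example names from this post.
RECORDS = [
    ("epicreads.com", "A", "52.172.142.222"),
    ("epicreads.com", "AAAA", "2001:db8::1"),
    ("www.epicreads.com", "CNAME", "epicreads.com"),
    ("epicreads.com", "NS", "ns1.cloudflare.com"),
]


def lookup(name, rtype, max_hops=5):
    """Return all values of the given type for a name, following CNAME aliases."""
    for _ in range(max_hops):
        matches = [v for n, t, v in RECORDS if n == name and t == rtype]
        if matches:
            return matches
        aliases = [v for n, t, v in RECORDS if n == name and t == "CNAME"]
        if not aliases:
            return []
        name = aliases[0]  # restart the lookup at the alias target
    return []  # give up on overly long or looping alias chains
```

With this data, lookup("www.epicreads.com", "A") follows the alias and returns the IPv4 address of epicreads.com, while a query for a type that has no record returns an empty list.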

Connecting epicreads.com to 52.172.142.222

To map the domain epicreads.com to the IP address 52.172.142.222, you create an A Record in the domain’s DNS zone:

epicreads.com.   300   IN   A   52.172.142.222

  • epicreads.com. – the hostname (the trailing dot indicates the DNS root)
  • 300 – the TTL in seconds (5 minutes); after this time, cached records expire
  • IN – Internet class
  • A – record type (IPv4 address mapping)
  • 52.172.142.222 – the destination IP address
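
The five fields can also be pulled apart programmatically. This is a minimal Python sketch, assuming the explicit five-field form shown above; real zone files allow omitted TTLs and classes, comments, and multi-line records, none of which this handles. ResourceRecord and parse_record are names invented for the illustration.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    value: str


def parse_record(line: str) -> ResourceRecord:
    """Split a five-field zone-file line into its named parts."""
    name, ttl, rclass, rtype, value = line.split(maxsplit=4)
    return ResourceRecord(name, int(ttl), rclass, rtype, value)


record = parse_record("epicreads.com.   300   IN   A   52.172.142.222")
print(record.rtype, record.value, record.ttl)
```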

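Why not a CNAME? A CNAME maps a name to another name, not to an IP address. A CNAME also cannot be used at the zone apex (the root domain, e.g., epicreads.com itself), because a CNAME cannot coexist with other records at the same name and the apex must always carry SOA and NS records. So www.epicreads.com could be a CNAME pointing to epicreads.com, but epicreads.com itself needs an A record (or an AAAA record for IPv6). Some DNS providers offer ALIAS records or CNAME flattening to work around this limitation.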
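
The CNAME rules lend themselves to a simple check. The following Python sketch is a simplified illustration, not a full zone validator: cname_problems is a hypothetical helper, it takes records as (name, type) pairs, and it ignores the DNSSEC-related record types that are allowed to sit alongside a CNAME.

```python
def cname_problems(apex, records):
    """Return human-readable problems for CNAME records in a zone.

    records is a list of (name, type) pairs; apex is the zone's root name.
    """
    problems = []
    types_by_name = {}
    for name, rtype in records:
        types_by_name.setdefault(name, set()).add(rtype)

    for name, types in sorted(types_by_name.items()):
        if "CNAME" not in types:
            continue
        if name == apex:
            problems.append(f"{name}: CNAME not allowed at the zone apex")
        elif len(types) > 1:
            problems.append(f"{name}: CNAME cannot coexist with other record types")
    return problems
```

A zone with an A record at epicreads.com and a CNAME at www.epicreads.com yields no problems, while putting a CNAME on epicreads.com itself is reported.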

The DevOps connection: You will configure DNS records constantly – pointing domains to load balancers, configuring subdomains for different services, setting up MX records for transactional email, and adding TXT records to verify domain ownership for SSL certificates. Understanding TTL is critical too: if you set a TTL of 86400 (24 hours) and need to change an IP urgently, resolvers that cached the old record may keep serving it for up to 24 hours. Lowering the TTL well before a planned change shortens that window.
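
The TTL arithmetic is easy to make concrete. This Python sketch assumes resolvers honour the TTL exactly, which real-world caches do not always do; latest_stale_answer and the change time are made up for the example.

```python
from datetime import datetime, timedelta


def latest_stale_answer(changed_at: datetime, old_ttl_seconds: int) -> datetime:
    """Worst case: a resolver cached the old record just before the change."""
    return changed_at + timedelta(seconds=old_ttl_seconds)


change = datetime(2025, 1, 1, 12, 0)  # hypothetical moment the record was changed
print(latest_stale_answer(change, 86400))  # 2025-01-02 12:00:00
print(latest_stale_answer(change, 300))    # 2025-01-01 12:05:00
```

With a 24-hour TTL the old IP can linger for a full day, whereas a 300-second TTL limits the window to five minutes.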

Task 5 – Development Environment Setup: Visual Studio Code

A professional development environment is not a luxury – it is the foundation on which all your work is built. I set up Visual Studio Code (VS Code) as my primary editor for this internship.

Why VS Code for DevOps?

VS Code has become the de facto standard for DevOps engineers for several reasons:

  • Language support: From Python and Go to Bash and YAML, VS Code handles everything through its extension marketplace.
  • Integrated terminal: You can run commands without switching windows, which becomes enormously productive over time.
  • Git integration: Built-in source control panel with diff views, staging, committing, and branching.
  • Extension ecosystem: Thousands of extensions for Docker, Kubernetes, Terraform, AWS, Azure, and more.
  • Remote development: The Remote – SSH and Dev Containers extensions allow you to develop directly on remote servers or inside containers, which is invaluable for DevOps workflows.

Key Extensions I Installed

Extension | Purpose
HashiCorp Terraform | Syntax highlighting, autocompletion for .tf files
Docker | Manage containers and images directly from VS Code
Kubernetes | Interact with clusters, view pods and logs
YAML | Linting and schema validation for Kubernetes manifests, CI/CD configs
GitLens | Enhanced Git history, blame annotations, and branch visualisation
Prettier | Code formatting for JavaScript, JSON, HTML, CSS
Remote – SSH | Develop on remote Linux servers as if they were local

The Broader Toolchain

VS Code is just the editor. A complete DevOps development environment also includes:

  • Git – version control (non-negotiable for every project)
  • A terminal – WSL2 on Windows, or the built-in terminal on macOS/Linux
  • Node.js / Python – scripting and automation
  • Docker Desktop – container runtime for local development
  • A cloud CLI – AWS CLI, Azure CLI, or gcloud, depending on your target platform

Getting comfortable with these tools before working on live infrastructure is essential. Mistakes in a local environment are free. Mistakes in production are expensive.
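
A quick way to confirm the toolchain is in place is to check that each executable is on the PATH. This is a small Python sketch; the executable names in TOOLS are assumptions (for example, the AWS CLI as the cloud CLI) and should be adjusted to your own setup.

```python
import shutil

# Executable names are assumptions; adjust to the tools you actually use.
TOOLS = ["git", "node", "python3", "docker", "aws"]


def check_tools(tools, which=shutil.which):
    """Map each tool name to its path on PATH, or None if it is missing."""
    return {tool: which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_tools(TOOLS).items():
        print(f"{tool:10} {'found at ' + path if path else 'MISSING'}")
```

The which parameter exists only so the lookup can be swapped out when testing; by default it uses shutil.which from the standard library.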

Reflection: Week 0 in Honest Review

What I Found Easy

The networking and DNS sections came naturally to me. These concepts map closely to everyday experiences – browsing websites, using email, navigating apps – so the mental models were already partially in place. I found that once you have the right analogy (packets as parcels, DNS as a phonebook, HTTPS as a locked envelope), the technical details click into place quickly.

What Was Difficult

Application architecture – specifically the distinction between two-tier and three-tier designs – required more effort than I anticipated. The concepts sound simple in isolation, but understanding the implications of each architectural decision takes deeper thinking. Why does moving business logic from the client to a dedicated application server change everything about scalability, security, and maintainability? The answer requires holding multiple concerns in mind simultaneously.

I also found that the most challenging part was not understanding what the layers are, but understanding why the separation exists and what goes wrong when it is violated. Reading about real-world examples – monolithic applications that became impossible to scale, data breaches caused by direct client-to-database access – made the architectural principles feel concrete rather than academic.

What I Will Improve Next Week

Hands-on practice with real tools. Reading and writing about networking is valuable, but there is a qualitative difference between understanding how DNS works conceptually and actually configuring a DNS zone, watching propagation happen, and debugging a misconfigured record. My goal for Week 1 is to close the gap between theoretical knowledge and practical muscle memory.

Specifically, I plan to:

  • Practice Linux command-line navigation and file management
  • Work through basic shell scripting exercises
  • Explore cloud console interfaces (starting with AWS)
  • Revisit application architecture by building a minimal three-tier app locally

Key Takeaways

If you have read this far, here is a summary of the most important concepts from Week 0:

  1. Networking protocols are layered. No single protocol handles everything. Understanding the layers prevents tunnel vision when debugging.
  2. Packet switching is what makes the internet resilient. Data takes multiple paths; failures are routed around automatically.
  3. HTTPS is not just about the padlock. It provides encryption, authentication, and integrity – three distinct security guarantees.
  4. Three-tier architecture is the baseline for modern applications. Separation of concerns enables independent scaling, improved security, and team autonomy.
  5. DNS is the phonebook of the internet, and A records map domain names to IPv4 addresses. TTL controls how long these mappings are cached globally.
  6. Your development environment is infrastructure. Set it up thoughtfully, version-control your configurations, and keep it consistent.

If you are following along or if you are on a similar DevOps learning path, feel free to connect in the comments. I would love to hear what foundational concepts you found most challenging – or which ones surprised you the most.

This post is part of my public learning journey through the DevOps Micro Internship (DMI) – Cohort 3 by Pravin Mishra at CloudAdvisory. All tasks completed in this programme are documented openly on this blog.

About DevOps Micro Internship (DMI) & CloudAdvisory
DevOps Micro Internship (DMI) is a free, project-based DevOps learning program by Pravin Mishra (CloudAdvisory). It helps students, job-seekers, and working professionals gain real-world DevOps skills through weekly assignments, projects, and community support.

🌐 DMI Official Website: https://pravinmishra.com/dmi

🎓 DevOps for Beginners: Docker, K8s, Cloud, CI/CD & 4 Projects (Udemy): https://www.udemy.com/course/devops-for-beginners-docker-k8s-cloud-cicd-4-projects/?referralCode=C5BA8236CCE9FE004F98

▶️ DevOps for Beginners – YouTube Playlist: https://www.youtube.com/playlist?list=PLVOdqXbCs7bX88JeUZmK4fKTq2hJ5VS89

🔗 Follow Pravin Mishra on LinkedIn: https://www.linkedin.com/in/pravin-mishra-aws-trainer/