Why Object-Oriented Programming Was Introduced – Objects and Classes

In the previous article, we looked at why software design matters and how software engineering principles exist to solve problems, not to add complexity.

In this article, we begin with one of the most significant programming paradigms: Object-Oriented Programming (OOP).

But before moving into definitions, let us examine the issue first.

🤯 The Problem

Imagine you are building a user system.

Without OOP:

const user1Name = "Ashay";
const user1Email = "ashay@gmail.com";

function loginUser1() {}
function logoutUser1() {}

Now imagine:

  • 10 users
  • 100 users
  • admins
  • customers
  • sellers

Everything becomes:

  • duplicated
  • disconnected
  • hard to manage

You have:

  • data scattered everywhere
  • behavior scattered everywhere

This is the core problem OOP tries to solve.

🧠 The Core Idea of OOP

OOP says:

“A real-world entity should keep its own data and behavior together.”

Example:

A User should contain:

  • its own properties (name, email)
  • its own capabilities (login, logout)

That combination becomes an object.

✨ What is an Object REALLY?

An object is:

A self-contained unit of state + behavior.

State = data
Behavior = actions/functions

Example:

const user = {
  name: "Ashay",
  email: "ashay@gmail.com",

  login() {
    console.log(`${this.name} logged in`);
  }
};

Here:

| Part | Meaning |
| --- | --- |
| name/email | state |
| login() | behavior |
| combined together | object |

This is the true meaning of an object.

Then Why Do We Need Classes?

Now imagine creating 10,000 users.

You DON’T want:

const user1 = { ... }
const user2 = { ... }
const user3 = { ... }

You need a reusable structure.
That reusable structure is a class.

What is a Class REALLY?

A class is:

A factory/template that defines what an object should contain.

Example:

class User {
  name: string;
  email: string;

  constructor(name: string, email: string) {
    this.name = name;
    this.email = email;
  }

  login() {
    console.log(`${this.name} logged in`);
  }
}

Now you can create objects easily:

const user1 = new User("Ashay","a@gmail.com");
const user2 = new User("Rahul","r@gmail.com");

Deep Understanding of new

This is VERY important.

When you do:

const user1 = new User(...)

new does several things internally:

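  1. Creates a new empty object {}
  2. Links that object to the class prototype (User.prototype)
  3. Binds this to the new object
  4. Runs the constructor
  5. Returns the object (unless the constructor explicitly returns a different object)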
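The steps above can be approximated by hand. This is a minimal sketch, not what the engine literally executes: createUserManually is a hypothetical helper, and because a class constructor cannot be called without new, the sketch repeats the constructor body instead of invoking it.

```typescript
class User {
  name: string;
  email: string;

  constructor(name: string, email: string) {
    this.name = name;
    this.email = email;
  }

  login(): string {
    return `${this.name} logged in`;
  }
}

// Hypothetical helper: a rough, hand-written equivalent of `new User(name, email)`.
function createUserManually(name: string, email: string): User {
  // Steps 1 and 2: create an empty object whose prototype is User.prototype.
  const instance: User = Object.create(User.prototype);
  // Steps 3 and 4: repeat the constructor body, with `instance` standing in for `this`.
  instance.name = name;
  instance.email = email;
  // Step 5: hand the finished object back.
  return instance;
}

const viaNew = new User("Ashay", "a@gmail.com");
const manual = createUserManually("Rahul", "r@gmail.com");

console.log(manual instanceof User); // true
console.log(manual.login()); // Rahul logged in
console.log(Object.keys(manual)); // [ 'name', 'email' ]
console.log(Object.getPrototypeOf(viaNew) === Object.getPrototypeOf(manual)); // true
```

Note that login does not show up in Object.keys: the state lives on each object, while the behavior is shared through the prototype.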

💡 Important Mental Model

A class is NOT the actual thing.

It is only:

  • definition
  • structure
  • contract

The object is the real runtime entity.

Example:

| Real World | OOP |
| --- | --- |
| Building blueprint | Class |
| Actual house | Object |
| Human DNA structure | Class |
| Actual person | Object |

Common Beginner Mistake

People think:

class User {}

means “doing OOP”.

No.

Real OOP begins when you think:

  • What responsibility belongs here?
  • What data should this object own?
  • What should be hidden?
  • How should objects communicate?
  • Who controls what?

That leads to:

  • Encapsulation
  • Abstraction
  • Polymorphism
  • Dependency Injection
  • SOLID principles

which we’ll cover one by one.

⏭️ What’s Next?

Now that we understand what OOP is and how classes and objects help organize state and behavior, an interesting question naturally arises, especially for JavaScript developers.

If JavaScript already allows us to create objects using plain object literals and factory functions, why do we need classes at all?

Are classes simply syntactic sugar, or do they solve a different set of problems?

Before we dive into the core principles of OOP such as Encapsulation, Abstraction, Inheritance, and Polymorphism, we’ll take a small detour to explore one of the most common debates in the JavaScript ecosystem:

Factory Functions vs Classes

We’ll compare both approaches, understand their trade-offs, and discuss when each one makes sense in real-world applications.

Because before learning how to design good objects, it’s worth understanding the different ways we can create them.

I Wrote 10 AI Stories in 10 Days. My Keyboard Started Smoking on Day 4.

Biggest thing I learned writing the AI, Ego & Regret series: I argue with myself way more than I thought.

Every post goes through the same loop:

10 PM: “This story’s fire. Gonna blow up tomorrow.”

1 AM: “Wait, did I make it clear that 450ms wasn’t just a random number?” → Scrolls back to check. Yes. OK. Move on.

Next morning: “What was I thinking? Scrap it. Rewrite from scratch.”

The cover images were the worst part. One article went through 6 different backgrounds before circling back to the first one. 45 minutes I’ll never get back.

Then there’s that one line: “It was right about yesterday - and yesterday wasn’t running anymore.” Rewrote it 11 times. My wife walked by and said, “I thought you were writing code, not poetry.”

Ben Halpern hit me with a 5-reaction combo while I was eating instant noodles. Almost choked.

Waking up at 3 AM. First instinct: check comments. Nothing. Go back to sleep. 5 minutes later: check again. Still nothing.

Writing code? I’m normal. Writing stories? I’m the guy verifying his own made-up RabbitMQ number at 1 AM.

Would I do it again? Yeah. Probably. But I’d get a better keyboard this time.

This coffee’s about to run out, and I’m not done typing yet. If these stories made you smile, chuckle, or roll your eyes, buy me a coffee and keep the keys smoking ☕🔥

Also: if you’ve got a story that’s been sitting in your head, something that made you laugh, cringe, or question every life decision that led to that moment, send it over. I’ll turn it into a story. Yours could be the next one.

No pressure. Just a keyboard that’s already warm.

Building AI agents with Vercel AI SDK

The Vercel AI SDK treats agents as tool-calling loops: the model generates text or invokes tools, the SDK runs those tools, and the loop continues until the model answers or a stop condition fires.

This post builds a support triage agent that looks up customers and invoices, searches an internal knowledge base, and either opens a ticket or escalates to a human. It builds on the LLM integration with Vercel AI SDK post and focuses on multiple tools, stopWhen, and stepCountIs.

For external tools exposed over MCP instead of SDK-native tool() handlers, see the MCP server with Node.js post.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • ai, @ai-sdk/openai, and zod installed (npm i ai @ai-sdk/openai zod)
  • Client setup from the Vercel AI SDK integration post

Mental model – steps and the tool loop

A step is one model generation. In that step the model either:

  • returns text (the loop ends), or
  • returns tool calls (the SDK executes them and starts another step with the results)

Typical flow for the support triage agent: user question → model calls lookup tools (getCustomer, getInvoice, searchKnowledgeBase) → model creates a ticket or escalates → final answer. stopWhen can end the loop before or after the write tools run.

stepCountIs(5) means “stop after 5 steps” (five model generations), not five individual tool calls. A single step can include multiple parallel tool calls.

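When you pass tools without stopWhen, the SDK falls back to a default cap. That default depends on the entry point and SDK version: ToolLoopAgent has defaulted to stepCountIs(20), while generateText and streamText have defaulted to stepCountIs(1), so check the reference for your version and set stopWhen explicitly.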
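To make the step-versus-tool-call distinction concrete, here is a small SDK-free simulation of the loop. Everything in it is illustrative: fakeModel is a scripted stand-in for a real model, runLoop is a hypothetical helper, and none of these names belong to the AI SDK.

```typescript
// Simplified model of the tool loop. Names are illustrative, not the AI SDK's API.
type ToolCall = { toolName: string; input: Record<string, string> };
type ModelOutput = { text: string } | { toolCalls: ToolCall[] };
type LoopStep = { output: ModelOutput; toolResults: { toolName: string; ok: boolean }[] };

// Scripted stand-in for a model: two tool-calling generations, then a text answer.
function fakeModel(stepIndex: number): ModelOutput {
  if (stepIndex === 0) {
    // One step, two parallel tool calls.
    return {
      toolCalls: [
        { toolName: "getCustomer", input: { customerId: "cus_1042" } },
        { toolName: "getInvoice", input: { invoiceId: "inv_8891" } },
      ],
    };
  }
  if (stepIndex === 1) {
    return {
      toolCalls: [{ toolName: "searchKnowledgeBase", input: { query: "duplicate charge refund" } }],
    };
  }
  return { text: "Open a ticket for the duplicate charge." };
}

function runLoop(maxSteps: number): LoopStep[] {
  const steps: LoopStep[] = [];
  while (steps.length < maxSteps) {
    const output = fakeModel(steps.length);
    if ("text" in output) {
      steps.push({ output, toolResults: [] });
      break; // A text-only generation ends the loop.
    }
    // Run every tool call from this generation, then start another step.
    const toolResults = output.toolCalls.map((call) => ({ toolName: call.toolName, ok: true }));
    steps.push({ output, toolResults });
  }
  return steps;
}

const fullRun = runLoop(5);
const totalToolCalls = fullRun.reduce((count, step) => count + step.toolResults.length, 0);
console.log(fullRun.length, totalToolCalls); // 3 3
console.log(runLoop(2).length); // 2
```

With a cap of 5 the run finishes in three steps, the first of which carries two tool calls. With a cap of 2 the loop is cut off before the model produces its final text.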

Support triage scenario

Example prompt:

Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?

A realistic chain:

  1. getCustomer – plan tier, open ticket count
  2. getInvoice – amount, status, payment IDs
  3. searchKnowledgeBase – duplicate-charge and refund policy
  4. createSupportTicket or escalateToHuman – write action or sentinel stop

The demo uses in-memory fixtures (customers, invoices, knowledge-base articles) so scripts run without a database.
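The fixture shapes are not shown in this post, so the following is only a guess at what they might look like. The interfaces, field names, and values (plan tiers, amounts, payment IDs, article text) are all hypothetical; only the customer and invoice IDs come from the example prompts.

```typescript
// Hypothetical fixture shapes; the real demo data may differ.
interface Customer {
  id: string;
  plan: "starter" | "pro";
  openTickets: number;
}

interface Invoice {
  id: string;
  customerId: string;
  amountUsd: number;
  status: "paid" | "open";
  paymentIds: string[];
}

interface Article {
  id: string;
  title: string;
  body: string;
}

const customers: Customer[] = [
  { id: "cus_1042", plan: "pro", openTickets: 1 },
  { id: "cus_2201", plan: "starter", openTickets: 0 },
];

const invoices: Invoice[] = [
  // Two payment IDs on one invoice model a duplicate charge.
  { id: "inv_8891", customerId: "cus_1042", amountUsd: 49, status: "paid", paymentIds: ["pay_001", "pay_002"] },
  { id: "inv_9104", customerId: "cus_2201", amountUsd: 190, status: "paid", paymentIds: ["pay_003", "pay_004"] },
];

const articles: Article[] = [
  { id: "kb_1", title: "Duplicate charge policy", body: "Refund the duplicate payment when an invoice has two successful payments." },
  { id: "kb_2", title: "Plan downgrade policy", body: "Downgrades take effect at the end of the billing period." },
];

// The same kind of lookup a tool's execute handler would perform.
const disputedInvoice = invoices.find((item) => item.id === "inv_8891");
console.log(disputedInvoice?.paymentIds.length); // 2
```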

Defining multiple tools

Register tools with tool() and Zod inputSchema. Clear description values help the model pick the right tool.

import { tool } from 'ai';
import { z } from 'zod';

const getCustomer = tool({
  description: 'Look up a customer account by ID',
  inputSchema: z.object({
    customerId: z.string().describe('Customer ID, e.g. cus_1042'),
  }),
  execute: async ({ customerId }) => {
    const customer = customers.find((item) => item.id === customerId);
    if (!customer) {
      return { found: false, customerId, error: 'Customer not found' };
    }
    return { found: true, customer };
  },
});

const getInvoice = tool({
  description: 'Look up an invoice by ID, including payment IDs and status',
  inputSchema: z.object({
    invoiceId: z.string().describe('Invoice ID, e.g. inv_8891'),
  }),
  execute: async ({ invoiceId }) => {
    const invoice = invoices.find((item) => item.id === invoiceId);
    if (!invoice) {
      return { found: false, invoiceId, error: 'Invoice not found' };
    }
    return { found: true, invoice };
  },
});

const searchKnowledgeBase = tool({
  description: 'Search internal support articles by keyword',
  inputSchema: z.object({
    query: z.string().describe('Search terms, e.g. duplicate charge refund'),
  }),
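  execute: async ({ query }) => {
    // Keyword match against the in-memory articles fixture.
    // Assumption: each article has `title` and `body` string fields.
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    const matches = articles.filter((article) => {
      const haystack = `${article.title} ${article.body}`.toLowerCase();
      return terms.some((term) => haystack.includes(term));
    });
    return { query, articles: matches };
  },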
});

Add write tools for outcomes:

const createSupportTicket = tool({
  description: 'Create a support ticket after gathering customer and policy context',
  inputSchema: z.object({
    customerId: z.string(),
    subject: z.string().min(3),
    priority: z.enum(['low', 'medium', 'high']),
    summary: z.string().min(10),
  }),
  execute: async (input) => {
    const ticket = createTicket(input);
    return { created: true, ticket };
  },
});

const escalateToHuman = tool({
  description: 'Escalate when policy requires manual review',
  inputSchema: z.object({
    customerId: z.string(),
    reason: z.string().min(10),
    urgency: z.enum(['normal', 'high']),
  }),
  execute: async (input) => ({
    escalated: true,
    queue: input.urgency === 'high' ? 'billing-urgent' : 'billing-standard',
    ...input,
  }),
});

Return structured objects from execute. The SDK serializes them as tool results for the next step. Return explicit errors (for example { found: false, error: '...' }) so the model can recover instead of throwing.

Multi-step triage with generateText

Pass all tools and a system prompt with triage rules:

import { openai } from '@ai-sdk/openai';
import { generateText, stepCountIs } from 'ai';

const { text, steps } = await generateText({
  model: openai('gpt-5.5'),
  system: `You are a billing support triage agent.
- Look up customer and invoice before recommending refunds.
- Search the knowledge base for policy guidance.
- Create a ticket when you can resolve within policy.
- Call escalateToHuman when manual review is required.`,
  tools: {
    getCustomer,
    getInvoice,
    searchKnowledgeBase,
    createSupportTicket,
    escalateToHuman,
  },
  stopWhen: stepCountIs(8),
  prompt:
    'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
});

console.log('steps:', steps.length);
console.log(text);

Use a model that supports tool calling (same requirement as web search in the Vercel AI SDK post).

stopWhen – when the loop stops

stopWhen defines stopping conditions for the tool loop. Conditions are evaluated only when the last step contains tool results.

  • A single condition stops when that condition returns true
  • An array stops when any condition returns true (OR logic)
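  • Without stopWhen, the SDK applies a default cap: stepCountIs(20) for ToolLoopAgent, while generateText and streamText have defaulted to stepCountIs(1), so check your SDK version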

The loop also ends naturally when the model returns text without further tool calls.

stepCountIs – cap the number of steps

stepCountIs(n) stops once steps.length reaches n. Use it on every production agent to prevent runaway loops and unbounded API cost.

| Use case | Suggested cap |
| --- | --- |
| Single tool, then answer | 2 (tool step + text step) |
| Chat with occasional tool use | 3-5 |
| Task agents (triage, research) | 8-15 |
| Long autonomous workflows | 15-20 (with monitoring) |

Tight vs relaxed cap on the same prompt:

import { generateText, stepCountIs } from 'ai';

// Stops after 3 steps even if the model still wants more context
const capped = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(3),
  prompt: '...',
});

// Allows a fuller investigation chain
const relaxed = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: '...',
});

Combining hasToolCall with stepCountIs

hasToolCall('toolName') stops when the model invokes a specific tool in the latest step. Pair it with stepCountIs for a hard cap plus a sentinel tool:

import { generateText, stepCountIs, hasToolCall } from 'ai';

const { text, steps } = await generateText({
  model: openai('gpt-5.5'),
  system: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: [stepCountIs(10), hasToolCall('escalateToHuman')],
  prompt:
    'Customer cus_2201 on the starter plan reports a duplicate $190 charge on invoice inv_9104.',
});

escalateToHuman works well as a sentinel: the loop stops as soon as the model decides the case needs a human, without waiting for a final text-only step.
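The OR logic behind an array of stop conditions can be pictured with plain predicates. This is a simplified re-implementation for illustration only: stepCountAtLeast, lastStepCalled, and shouldStop are hypothetical names, not the SDK's stepCountIs and hasToolCall.

```typescript
// Illustrative stand-ins for stop conditions; not the AI SDK's implementation.
type StepRecord = { toolCalls: { toolName: string }[] };
type StopCondition = (steps: StepRecord[]) => boolean;

// Hard cap: true once the number of completed steps reaches n.
const stepCountAtLeast = (n: number): StopCondition => (steps) => steps.length >= n;

// Sentinel: true when the most recent step invoked the named tool.
const lastStepCalled = (toolName: string): StopCondition => (steps) => {
  const last = steps[steps.length - 1];
  return last !== undefined && last.toolCalls.some((call) => call.toolName === toolName);
};

// An array of conditions stops the loop when any one of them is true.
function shouldStop(conditions: StopCondition[], steps: StepRecord[]): boolean {
  return conditions.some((condition) => condition(steps));
}

const stopConditions = [stepCountAtLeast(10), lastStepCalled("escalateToHuman")];
const stepsSoFar: StepRecord[] = [
  { toolCalls: [{ toolName: "getCustomer" }] },
  { toolCalls: [{ toolName: "getInvoice" }] },
];

console.log(shouldStop(stopConditions, stepsSoFar)); // false

stepsSoFar.push({ toolCalls: [{ toolName: "escalateToHuman" }] });
console.log(shouldStop(stopConditions, stepsSoFar)); // true
```

The sentinel fires at step 3, long before the cap of 10, which is the behavior the pairing is meant to give you.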

Inspecting steps and usage

The steps array on the result contains per-step tool calls, tool results, finish reason, and usage. Use it for debugging and cost tracking:

const { text, steps, totalUsage } = await generateText({
  model: openai('gpt-5.5'),
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: '...',
});

for (const [index, step] of steps.entries()) {
  console.log(`step ${index + 1}`);
  console.log('  toolCalls:', step.toolCalls?.map((c) => c.toolName));
  console.log('  usage:', step.usage);
}

console.log('totalUsage:', totalUsage);

With streamText, pass onStepFinish to log each step as it completes.
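For cost tracking, per-step usage can be rolled up into totals and a rough dollar estimate. The sketch below works on plain objects, not live SDK results; StepLog, summarizeRun, the token counts, and both prices are made-up assumptions, with field names chosen to resemble the usage object.

```typescript
type StepUsage = { inputTokens: number; outputTokens: number };
type StepLog = { toolNames: string[]; usage: StepUsage };

// Hypothetical prices in USD per million tokens; substitute your provider's real rates.
const INPUT_PRICE_PER_MILLION = 2;
const OUTPUT_PRICE_PER_MILLION = 8;

function summarizeRun(stepLogs: StepLog[]) {
  let inputTokens = 0;
  let outputTokens = 0;
  const toolCounts: Record<string, number> = {};

  for (const step of stepLogs) {
    inputTokens += step.usage.inputTokens;
    outputTokens += step.usage.outputTokens;
    for (const toolName of step.toolNames) {
      toolCounts[toolName] = (toolCounts[toolName] ?? 0) + 1;
    }
  }

  const estimatedCostUsd =
    (inputTokens * INPUT_PRICE_PER_MILLION + outputTokens * OUTPUT_PRICE_PER_MILLION) / 1_000_000;

  return { inputTokens, outputTokens, toolCounts, estimatedCostUsd };
}

// Made-up numbers for a three-step run: two tool steps, then a text step.
const sampleRun: StepLog[] = [
  { toolNames: ["getCustomer", "getInvoice"], usage: { inputTokens: 400, outputTokens: 60 } },
  { toolNames: ["searchKnowledgeBase"], usage: { inputTokens: 700, outputTokens: 40 } },
  { toolNames: [], usage: { inputTokens: 900, outputTokens: 150 } },
];

const runSummary = summarizeRun(sampleRun);
console.log(runSummary.inputTokens, runSummary.outputTokens); // 2000 250
console.log(runSummary.estimatedCostUsd); // 0.006
```

Input tokens grow from step to step because each new generation re-reads the earlier tool results, which is why a high step cap costs more than the step count alone suggests.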

ToolLoopAgent – reusable agent definition

ToolLoopAgent wraps the same loop for reuse across scripts and API routes. It accepts the same settings as generateText (tools, stopWhen, instructions).

import { ToolLoopAgent, stepCountIs } from 'ai';

const supportTriageAgent = new ToolLoopAgent({
  model: openai('gpt-5.5'),
  instructions: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: stepCountIs(8),
});

const result = await supportTriageAgent.generate({
  prompt:
    'Customer cus_1042 says they were charged twice for invoice inv_8891. What should we do?',
  onStepFinish: async ({ stepNumber, usage, toolCalls }) => {
    console.log(`step ${stepNumber + 1}`, {
      tokens: usage.totalTokens,
      tools: toolCalls?.map((call) => call.toolName),
    });
  },
});

console.log(result.text);

Use .stream() for streaming. For Next.js chat UIs, see createAgentUIStreamResponse in the AI SDK agents docs.

Streaming with tools

streamText supports the same tools and stopWhen settings:

import { streamText, stepCountIs } from 'ai';

const result = streamText({
  model: openai('gpt-5.5'),
  system: TRIAGE_INSTRUCTIONS,
  tools: supportTools,
  stopWhen: stepCountIs(8),
  prompt: 'Customer cus_1042 says they were charged twice for invoice inv_8891.',
  onStepFinish: async ({ stepNumber, toolCalls }) => {
    console.error(`step ${stepNumber + 1}:`, toolCalls?.map((c) => c.toolName));
  },
});

for await (const part of result.textStream) {
  process.stdout.write(part);
}

Text streams incrementally. Tool calls run between text segments as the loop progresses.

Production notes

  • Always set stopWhen – do not rely on the default stepCountIs(20) in production without monitoring
  • Cost – each step is another model call; log steps or onStepFinish usage
  • Tool errors – return structured errors from execute instead of throwing when the model should retry or escalate
  • Instructions – keep policy rules in system / instructions, not only in the user prompt
  • Same patterns elsewhere – PR review (listPRs → getChecks → submitReview) or job-fit scoring use the same loop mechanics with different tools

Demo

Runnable scripts for each section live in the vercel-ai-sdk-agents-demo folder. Get access via code demos.

The Onboarding Math: What Each New Hire Actually Costs When Your AI Stack Is Fragmented

Here is a number most finance teams have never calculated: the total cost of getting one new employee to full productivity in a fragmented digital environment.

Not salary. Not benefits. Not the recruiting fee. The cost of the time between their start date and the date they can operate independently at full output, including the time of every colleague who helps them get there.

For companies with consolidated, well-designed digital environments, this number runs 6 to 10 weeks of fully-loaded compensation. For companies with fragmented stacks of 8 to 12 tools that don’t integrate cleanly, it runs 14 to 20 weeks.

That gap, 4 to 10 weeks of productive capacity per hire, is one of the most expensive invisible costs in enterprise operations. And it scales directly with headcount growth.

The Calculation Most Companies Skip

The standard onboarding cost model looks like this: recruiter fee plus first-month salary plus benefits plus laptop plus software licenses. A mid-market company bringing on a $120,000 salary employee calculates something in the range of $15,000 to $20,000 in onboarding costs.

The actual cost model should look like this.

Assume a fully-loaded cost of $85 per hour for the new employee. Assume a productivity ramp that reaches 50% effectiveness at week 4 and 80% effectiveness at week 10, reaching full productivity at week 14 for a company with a moderately complex tool stack.

The opportunity cost of the ramp period (the output not produced compared to a fully productive employee) runs approximately $18,000 to $24,000 per hire at this salary level, before accounting for any manager or colleague time invested.
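One way to reproduce a figure in that range is to treat the ramp as straight lines between the stated checkpoints and add up the missing output. The sketch rests on two assumptions that are not in the text: productivity starts at 0% in week 0, and a working week is 40 hours.

```typescript
type RampPoint = { week: number; productivity: number };

// Checkpoints from the text, plus an assumed 0% starting point at week 0.
const ramp: RampPoint[] = [
  { week: 0, productivity: 0 }, // assumption, not stated in the text
  { week: 4, productivity: 0.5 },
  { week: 10, productivity: 0.8 },
  { week: 14, productivity: 1 },
];

const HOURLY_COST_USD = 85; // fully-loaded cost from the text
const HOURS_PER_WEEK = 40; // assumption

// Weeks of output lost during the ramp, using linear interpolation between checkpoints.
function lostWeeks(points: RampPoint[]): number {
  let lost = 0;
  for (let i = 1; i < points.length; i++) {
    const previous = points[i - 1];
    const current = points[i];
    const averageProductivity = (previous.productivity + current.productivity) / 2;
    lost += (current.week - previous.week) * (1 - averageProductivity);
  }
  return lost;
}

const rampCostUsd = lostWeeks(ramp) * HOURS_PER_WEEK * HOURLY_COST_USD;
console.log(Math.round(rampCostUsd)); // 18700
```

Under these assumptions the ramp loses about 5.5 weeks of output, or roughly $18,700, which sits at the low end of the quoted range. A slower start or a longer ramp pushes the figure higher.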

Now add the colleague time. A new employee in a 10-tool environment asks more questions, requires more hand-holding on where things live, and pulls more time from more experienced colleagues than a new employee in a 4-tool environment. Conservative estimate: 8 hours of senior colleague time in week one, declining to 2 hours per week by month two. Total: 40 to 50 hours of experienced-employee time per new hire.

At $85 to $120 per hour for the colleagues involved, that’s $3,400 to $6,000 in real cost that shows up nowhere in the onboarding budget.

For a company hiring 30 people per year at this salary level, the aggregate fragmentation tax on onboarding runs $650,000 to $900,000 annually. Not as a line item. As a diffuse, invisible drag on productivity that never surfaces in any report.
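The aggregate follows directly from the per-hire figures above. A quick check, using only numbers from the text (the variable names are illustrative):

```typescript
// Per-hire ranges quoted in the text.
const rampCostRangeUsd = { low: 18_000, high: 24_000 };
const colleagueHoursRange = { low: 40, high: 50 };
const colleagueRateRangeUsd = { low: 85, high: 120 };
const hiresPerYear = 30;

const colleagueCostLow = colleagueHoursRange.low * colleagueRateRangeUsd.low;
const colleagueCostHigh = colleagueHoursRange.high * colleagueRateRangeUsd.high;

const annualLow = (rampCostRangeUsd.low + colleagueCostLow) * hiresPerYear;
const annualHigh = (rampCostRangeUsd.high + colleagueCostHigh) * hiresPerYear;

console.log(colleagueCostLow, colleagueCostHigh); // 3400 6000
console.log(annualLow, annualHigh); // 642000 900000
```

The low end comes out at $642,000, which the text rounds to roughly $650,000.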

What Creates the Fragmentation Tax

The root cause is straightforward. Every tool in the stack is a thing a new employee must learn. Not just the interface, but the conventions, the permissions model, where different types of content live, which channel is authoritative for which decisions, how the tool integrates with the other tools they’re also learning simultaneously.

A 4-tool environment has roughly 6 integration relationships to understand (each tool to each other tool). A 10-tool environment has 45. The cognitive load of navigating those relationships, before the new employee can focus on their actual job, is substantial.
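Those counts are the number of distinct tool pairs, n(n - 1) / 2. A one-function check (pairwiseRelationships is an illustrative name):

```typescript
// Number of distinct tool-to-tool pairs among n tools.
function pairwiseRelationships(toolCount: number): number {
  return (toolCount * (toolCount - 1)) / 2;
}

console.log(pairwiseRelationships(4)); // 6
console.log(pairwiseRelationships(10)); // 45
```

Going from 4 tools to 10 multiplies the tool count by 2.5 but the relationships by 7.5, which is why the load grows faster than the stack.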

The fragmentation tax compounds for roles that require cross-functional visibility. A project manager who needs to pull status from 6 different systems, synthesize it, and report upward is performing coordination labor that serves no other output. That labor is invisible because it’s embedded in their job description, but it’s directly caused by the tool architecture and would be reduced or eliminated by consolidation.

The Onboarding Efficiency Benchmark

A useful benchmark: what is your time-to-independent-operation for a new hire in a standard individual contributor role?

Not time to first task completion. Time to independent operation: the point at which the employee can navigate all required systems, locate information without asking, complete their core workflows without assistance, and contribute to cross-functional projects without needing to be oriented by colleagues.

For companies that have measured this with any precision, the results correlate strongly with tool stack complexity. Sub-8-week time-to-independence is achievable with consolidated environments. 16-plus weeks is typical for highly fragmented environments.

If you don’t know your current number, it’s worth measuring. Ask managers from three departments: how long until a new hire in this role can operate fully independently? The variance in the answers will tell you something. The averages will tell you something else.

Where This Intersects with AI Tools

The emergence of AI tools in enterprise environments adds a new dimension to this calculation.

In a consolidated AI environment, where the AI is integrated into the workspace teams already use, new employees learn the AI as part of learning the environment. The AI has access to the same context the employee is learning to navigate.

In a fragmented AI environment, where the AI is a separate tool that must be connected to other tools, prompted correctly for each use case, and maintained as another system in an already complex stack, new employees face an additional learning curve layered on top of an already demanding onboarding.

Worse, AI tools in fragmented environments often underperform for new employees specifically, because the AI lacks the organizational context that makes it useful. An AI that can’t access the CRM, the project management system, and the communication history simultaneously can’t help a new employee understand the state of a customer relationship or a project the way an integrated AI can.

The onboarding math changes when the AI has full context. The time-to-independent-operation shortens not because the tools are simpler but because the AI can answer the questions that would otherwise require a colleague’s time.

The Annual Cost of Not Fixing This

Take your current annual hiring volume. Multiply by the average fully-loaded salary of new hires. Apply a ramp efficiency factor based on your estimated time-to-independence. Add colleague time costs.

That number is the annual cost of your current fragmentation: not the total cost of fragmentation, but the portion attributable to onboarding alone.

For most mid-market companies hiring 20 to 50 people per year, this calculation produces a number large enough to justify a serious consolidation investment. The tools exist. The math usually makes the case clearly once someone does it.

The question is whether anyone has been asked to do it.

I built a self-hosted log search tool for my team

The backstory

Some time ago I adopted Quickwit at my company. For anyone who hasn’t used it: Quickwit is a search engine that runs full-text search directly on object storage (S3 or anything S3-compatible). It decouples compute from storage, so you don’t pay to keep big indexes warm to search older data. That model fits logs well.

It worked well for us, but there was a gap. Quickwit is excellent at the search engine part and leaves the rest to you by design: no end-user experience around it, and little access control. We were missing what a team needs day to day, like a usable UI, authentication, and gated ingest.

I started building that layer myself. It turned into Rootprint.

What it is

Besides the basics you’d expect for working with logs (controlling the view, seeing context around a log line, a filters panel, a histogram, severity-aware views), it adds:

  • User management with Google and GitHub authentication
  • Authenticated endpoints to ingest data
  • Cluster stats so you can see what’s going on
  • A user activity overview
  • Control over sources, and more

It runs on your own infrastructure, and it’s Apache-2.0 licensed.

The stack

I wanted it light and fast, so:

  • Backend: Hono running on Bun
  • Frontend: Svelte 5 + SvelteKit, with Tailwind and DaisyUI

How it plugs in

Rootprint connects to any Quickwit instance through an environment variable. One caveat: it needs Quickwit 0.9+.

To get started, grab the Docker Compose file and run it:

curl -o docker-compose.yml https://docs.rootprint.io/files/docker-compose.full.yaml
docker compose up -d

The docs cover the rest of the installation options: docs.rootprint.io/install/docker-compose.

Where it’s going

I want to build a platform for logs and traces that holds up against anything else out there. Right now I’m focused on the log search experience; traces are on the roadmap but not built yet.

It’s still early and pre-1.0, so expect breaking changes between releases.

I’d love your feedback

If this sounds useful, I’d appreciate you trying it out and telling me what’s broken, confusing, or missing. Feedback and contributors are welcome.

  • Source code: github.com/rootprint/rootprint
  • Docs: docs.rootprint.io

Thanks for reading.

Rider 2026.2 EAP 5: Code Quality Checks for Your AI Agents, and More.

Rider 2026.2 EAP 5 is now available, bringing a faster startup flow with the new non-modal Welcome screen and quality-check hooks for AI agents.

If you’re catching up on the 2026.2 EAP cycle, be sure to check out the blog posts we’ve already published about other updates unveiled so far, including WPF Hot Reload, the finding-tests skill for AI-assisted test generation, and the earlier EAP builds.

Download Rider 2026.2 EAP 5

Quality-check hooks for Claude Code and Codex

Rider 2026.2 EAP 5 introduces bundled quality-check hooks for external AI agents, starting with Claude Code and Codex. In agent workflows, a hook is an automated step that runs at a specific point in the agent’s process. Here, Rider uses a PostToolUse hook: after an agent edits a file, Rider automatically runs IDE-level validation before the agent continues.

This means agent-generated code is no longer just accepted as-is. These checks can detect code issues identified by Rider’s built-in analysis and inspections, as well as formatting inconsistencies.

Agent-generated code with and without Rider’s quality-check feedback.
Watch Rider hooks catch potential errors and redirect the agent.

Errors can block the agent from treating the task as complete, while warnings and other findings are returned as feedback the agent can use to fix its own output. The result is a tighter AI-assisted development loop where the IDE, not the agent, sets the quality bar.

Easier access to Explain with AI

The Explain with AI action is now easier to discover when you need it most: while dealing with build errors and runtime exceptions. Instead of copying diagnostics into chat or manually describing what went wrong, you can trigger an AI explanation directly from the place where Rider surfaces the problem.

For .NET developers, this is especially useful because build output often combines Roslyn diagnostics, analyzer warnings, MSBuild issues, NuGet restore problems, and multi-targeting failures. Explain with AI helps turn noisy or context-dependent errors into a clearer explanation with likely causes and next steps, so you can move from failure to fix faster.

Share your thoughts

That’s it for Rider 2026.2 EAP 5. Download the latest EAP build, try the new features for AI-assisted development, and let us know how they work in your projects.


SoloEngine: How to Let AI Run Every Industry

As someone with three years of experience in large language model algorithms, agent development, and knowledge base construction, I’ve recently had a thought: Vibe Coding has emerged in the programming industry simply because programmers know how to write code. Other industries don’t have Cursor or Claude Code, not because they lack the need for Agentic AI, but because they don’t use LangChain or CrewAI. I wanted to build a tool that lowers the barrier to Agentic AI development to the same simplicity as workflow tools like Dify. Thus, SoloEngine was born.

SoloEngine, as the first low‑code Agentic AI development platform, fully encapsulates mechanisms such as ReAct, Tool, MCP, Skill, and SubAgent into backend services. When using it, you simply drag an agent onto the canvas, connect collaboration relationships, configure the required tools, and click Run. The backend then automatically compiles everything into your very own Claude Code: planning, execution, and delivery are all autonomously completed by the agent.

Comparison: SoloEngine vs Other Solutions

| Feature | Dify, n8n, Zapier | LangChain, CrewAI, LangGraph | SoloEngine |
| --- | --- | --- | --- |
| Agentic AI | ✗ Scripted workflows only | ✓ ReAct / Multi‑Agent | ✓ ReAct / Multi‑Agent |
| No coding required | ✓ | ✗ Python mandatory | ✓ |
| Visual orchestration | Partial support | ✗ | ✓ Full canvas experience |
| Domain experts can build independently | ✓ (but workflows are not truly Agentic) | ✗ | ✓ |
| Multi‑agent collaboration | ✗ | ✓ | ✓ |

Core Design

For compilation efficiency, all agent nodes adopt a unified ReAct architecture. The platform parses superior‑subordinate relationships through topology, enabling connections and SubAgent calls. The visual design on the canvas is directly compiled into an executable agent team.

At runtime, each agent employs progressive disclosure, loading only the MCPs and Skills it needs on demand; token consumption can be reduced by over 85%.

On the model side, SoloEngine covers commonly used model providers such as OpenAI, Anthropic, Ollama, DeepSeek, Qwen, and Zhipu, behind a unified interface for seamless switching.

Release Updates

After more than a dozen development iterations, the v0.2 file change tracking and rollback mechanism has been released and is relatively stable. An official release build will be available soon. v0.3's one‑click deployment feature for Agentic AI is in its final stages; it will allow compiled agent teams to be packaged as standalone products for self‑deployment, distribution, or sale. Meanwhile, long‑term memory and autonomous evolution are also on the roadmap.

Quick Start

git clone https://github.com/Sh4r1ock/SoloEngine.git
cd SoloEngine

# Backend (Python 3.11+)
cd backend
pip install -r requirements.txt
python main.py

# Frontend (Node.js 18+), run in another terminal from the repository root
cd frontend
npm install
npm run dev

Open http://localhost:8991 to build your first agent team.

Get Involved

The project is currently in a phase of rapid iteration. More participants are welcome to help AI drive every industry. We hope that in the future, AI will evolve from Vibe Coding into Vibe Everything.

Project repository: https://github.com/Sh4r1ock/SoloEngine

TLS Fingerprinting: How JA3 and JA4 Identify You Before You Send a Byte

Encryption hides the contents of your HTTPS connection, but the negotiation that sets up that encryption happens in the clear. The very first message your client sends, before a single byte of application data, has a distinctive shape. JA3 and JA4 turn that shape into a fingerprint that can identify your software, and sometimes route, throttle, or block you on the spot.

Every HTTPS connection starts with a TLS handshake, and the handshake starts with a message called the ClientHello. It is sent unencrypted, because the two sides have not yet agreed on a key. Inside it, your client announces everything it is willing to do: which TLS versions it supports, which cipher suites it prefers and in what order, which extensions it understands, which elliptic curves and signature algorithms it offers.

None of that is secret. None of it has to be. But taken together, the exact set and ordering of those parameters is remarkably specific to a particular piece of software at a particular version. Chrome 124 produces a different ClientHello from Firefox, which produces a different one from Python’s requests library, which differs from Go’s standard library, which differs from a curl built against a specific OpenSSL version. TLS fingerprinting is the practice of hashing that ClientHello into a short, stable identifier and looking it up.

What Goes Into the Fingerprint

The original technique, JA3, was published by three engineers at Salesforce in 2017: John Althouse, Jeff Atkinson, and Josh Atkins, whose initials gave it the name. JA3 builds a string from five fields of the ClientHello, in order:

  • The TLS version offered
  • The list of cipher suites
  • The list of extensions
  • The list of supported elliptic curves (named groups)
  • The list of elliptic-curve point formats

Each field is rendered as its numeric values joined by hyphens, the fields are joined by commas, and the whole string is hashed with MD5 to produce a 32-character fingerprint. A companion technique, JA3S, does the same for the server’s ServerHello, so you can fingerprint both ends of a conversation. Pairing a client JA3 with a server JA3S is a common way to identify specific malware command-and-control channels, because the malware and its server both produce consistent, unusual hashes.
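The construction described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not a reference implementation: the ClientHelloFields interface and its field names are hypothetical, the sample values are made up, and parsing the handshake itself is assumed to have happened elsewhere.

```typescript
import { createHash } from "node:crypto";

// Parsed ClientHello fields as plain numbers. The interface and field names
// are illustrative and do not come from any particular parser.
interface ClientHelloFields {
  tlsVersion: number;     // e.g. 771 for 0x0303
  cipherSuites: number[];
  extensions: number[];
  namedGroups: number[];  // the "supported elliptic curves" list
  pointFormats: number[];
}

// Values within a field are joined by "-", the five fields by ",".
function ja3String(h: ClientHelloFields): string {
  return [
    String(h.tlsVersion),
    h.cipherSuites.join("-"),
    h.extensions.join("-"),
    h.namedGroups.join("-"),
    h.pointFormats.join("-"),
  ].join(",");
}

// MD5 of that string, rendered as 32 hex characters.
function ja3Hash(h: ClientHelloFields): string {
  return createHash("md5").update(ja3String(h)).digest("hex");
}

// Made-up sample values, plus a copy with two cipher suites swapped.
const sample: ClientHelloFields = {
  tlsVersion: 771,
  cipherSuites: [4865, 4866, 49195],
  extensions: [0, 10, 11, 13],
  namedGroups: [29, 23, 24],
  pointFormats: [0],
};
const reordered: ClientHelloFields = { ...sample, cipherSuites: [4866, 4865, 49195] };

console.log(ja3String(sample));
console.log(ja3Hash(sample) === ja3Hash(reordered)); // swapping two ciphers changes the hash
```

Because the hash covers the lists exactly as offered, the two samples support the same cipher suites yet produce different fingerprints.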

Why ordering matters: Two clients can support the exact same cipher suites and still fingerprint differently, because they offer them in a different preference order. That ordering is baked into the TLS library and rarely changes between builds, which is exactly what makes it a stable signal.

Why JA3 Started to Break

JA3 worked well for years, but two developments eroded it. The first was GREASE (RFC 8701), a mechanism Google introduced to keep the TLS ecosystem flexible. GREASE makes clients insert random reserved values into their cipher and extension lists, so that middleboxes don’t hard-code assumptions about what they see. The side effect is that a naive JA3 implementation produces a different hash on every connection unless it explicitly strips the GREASE values out.
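Stripping GREASE is mechanical once you know the pattern. RFC 8701 reserves the 16-bit values 0x0A0A, 0x1A1A, and so on up to 0xFAFA: both bytes are equal and the low nibble of each is 0xA. A small sketch of the filter, with hypothetical function names:

```typescript
// True for the reserved GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA.
function isGrease(value: number): boolean {
  const hi = (value >> 8) & 0xff;
  const lo = value & 0xff;
  return hi === lo && (lo & 0x0f) === 0x0a;
}

// Remove GREASE entries from a cipher, extension, or named-group list
// before the list is fed into a fingerprint.
function stripGrease(values: number[]): number[] {
  return values.filter((v) => !isGrease(v));
}

console.log(stripGrease([0x2a2a, 4865, 4866, 0xfafa]));
```

Running every list through a filter like this before hashing is what keeps a JA3 implementation from producing a fresh hash on each connection.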

The second was TLS 1.3 and the rise of extension shuffling. Chrome began randomizing the order of some ClientHello extensions on each connection specifically to discourage fingerprinting and ossification. Against a technique that depends on extension ordering, that is fatal: the same browser now yields many different JA3 hashes.

JA4: The Redesign

In 2023, John Althouse, one of the original JA3 authors and now at FoxIO, released JA4, the centerpiece of a broader suite called JA4+ that fingerprints not just TLS but HTTP, TCP, SSH, and more. JA4 was designed to survive the things that broke JA3.

The biggest structural change is that JA4 is partly human-readable. Instead of one opaque MD5, a JA4 fingerprint is divided into sections you can read at a glance:

  • A prefix describing the transport and TLS version, whether SNI is present, the count of cipher suites, the count of extensions, and the first ALPN value (for example, whether the client is speaking HTTP/2 or HTTP/1.1)
  • A truncated hash of the cipher suites, sorted numerically so that order-shuffling no longer changes the result
  • A truncated hash of the extensions and signature algorithms, also handled so that cosmetic reordering doesn’t matter

GREASE values are stripped by definition. Because the cipher and extension lists are sorted before hashing, Chrome’s randomization no longer produces a moving target. The result is a fingerprint that is both more stable than JA3 and more informative, because a human analyst can read meaningful structure out of the prefix without consulting a lookup table.
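The sort-then-hash idea is easy to demonstrate. The sketch below loosely follows the published JA4 layout (a readable prefix plus SHA-256 hashes truncated to 12 hex characters), but it is a simplified illustration under those assumptions and skips details of the real specification, such as which extensions are excluded and how signature algorithms are appended. Function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Order-independent hash: render each value as 4 hex digits, sort,
// join with commas, SHA-256, keep the first 12 hex characters.
function sortedTruncatedHash(values: number[]): string {
  const hex = values.map((v) => ("0000" + v.toString(16)).slice(-4)).sort();
  return createHash("sha256").update(hex.join(",")).digest("hex").slice(0, 12);
}

// Human-readable prefix: transport, TLS version, SNI flag, two-digit counts,
// and the first and last character of the first ALPN value.
function readablePrefix(opts: {
  transport: "t" | "q";   // TCP or QUIC
  version: string;        // e.g. "13" for TLS 1.3
  sni: boolean;
  cipherCount: number;
  extensionCount: number;
  alpn?: string;          // first ALPN value, e.g. "h2"
}): string {
  const two = (n: number) => ("00" + Math.min(n, 99)).slice(-2);
  const alpn = opts.alpn ? opts.alpn[0] + opts.alpn[opts.alpn.length - 1] : "00";
  return opts.transport + opts.version + (opts.sni ? "d" : "i") +
    two(opts.cipherCount) + two(opts.extensionCount) + alpn;
}

const prefix = readablePrefix({
  transport: "t", version: "13", sni: true,
  cipherCount: 15, extensionCount: 16, alpn: "h2",
});
console.log(prefix);
console.log(sortedTruncatedHash([4865, 4866, 49195]) === sortedTruncatedHash([49195, 4865, 4866]));
```

The same list in a different order now hashes identically, which is the property that shuffled extension lists could not give JA3.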

Property | JA3 (2017) | JA4 (2023)
Output | Single MD5 hash | Structured, partly human-readable
Handles GREASE | Only if implementation strips it | Yes, by design
Survives extension shuffling | No (order-dependent) | Yes (lists are sorted)
Scope | TLS ClientHello / ServerHello | TLS, HTTP, TCP, SSH and more (JA4+)

Who Uses This, and For What

TLS fingerprinting is genuinely dual-use. On the defensive side, it is one of the more useful tools a network operator has. A client that claims to be Chrome in its User-Agent header but whose ClientHello matches Python’s requests is almost certainly a bot lying about itself. Security teams use JA3/JA4 to spot malware beaconing, to cluster automated traffic, and to flag scrapers that don’t match any real browser. Because the fingerprint is computed from bytes the client cannot easily fake without rebuilding its TLS stack, it is harder to spoof than a header.
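The User-Agent mismatch check can be sketched as a lookup followed by a comparison. Everything named here is hypothetical: the fingerprint strings are placeholders rather than real hashes, and a real deployment would load its table from a maintained fingerprint database.

```typescript
// Hypothetical table mapping a fingerprint to the client family that
// produces it. The keys are placeholders, not real JA3/JA4 values.
const knownFingerprints: Record<string, string> = {
  "placeholder-fp-chrome": "chrome",
  "placeholder-fp-python-requests": "python-requests",
};

// Very rough guess at the family a User-Agent header claims to be.
function claimedFamily(userAgent: string): string {
  const ua = userAgent.toLowerCase();
  if (ua.indexOf("python-requests") !== -1) return "python-requests";
  if (ua.indexOf("firefox") !== -1) return "firefox";
  if (ua.indexOf("chrome") !== -1) return "chrome";
  return "unknown";
}

type Verdict = "consistent" | "mismatch" | "unknown-fingerprint";

// Compare what the header claims with what the handshake shows.
function checkClient(userAgent: string, fingerprint: string): Verdict {
  const observed = knownFingerprints[fingerprint];
  if (observed === undefined) return "unknown-fingerprint";
  return observed === claimedFamily(userAgent) ? "consistent" : "mismatch";
}

const chromeUa = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/124.0 Safari/537.36";
console.log(checkClient(chromeUa, "placeholder-fp-python-requests"));
```

A "mismatch" verdict is a signal to weigh, not proof: proxies and TLS-terminating middleboxes can also make a genuine browser present someone else's handshake.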

That same strength is what makes it a censorship and tracking tool. A national firewall or a corporate middlebox can fingerprint every outbound connection and treat traffic differently based on what software produced it: throttling or blocking a circumvention tool whose handshake doesn’t look like a mainstream browser, even though it cannot read the encrypted payload. Anti-bot vendors and CDNs fingerprint connections to decide who gets served and who gets a challenge. The fingerprint becomes a passive selector applied before you have proven anything about who you are.

The encryption is doing its job perfectly. The leak is in the envelope, not the letter, and the envelope is, by necessity, written in the clear.

Can You Defend Against It?

Not cleanly, and that is the uncomfortable part. Because the fingerprint is derived from how your TLS library behaves, the only thorough defense is to make your traffic produce a common, unremarkable fingerprint, in other words to look like everyone else. Circumvention tools increasingly do exactly this through uTLS, a Go library that lets a client mimic the precise ClientHello of a mainstream browser, GREASE and ordering included, so its JA3/JA4 blends into the crowd.

For an ordinary user, the practical reality is simpler: using a current, mainstream browser is itself a form of crowd-blending, because millions of others produce a near-identical handshake. The danger zone is unusual software (a custom client, an old library, a niche tool) that produces a rare fingerprint precisely because few others share it. This is the same logic that governs browser fingerprinting at the application layer: distinctiveness is the vulnerability, and the anonymity set is the defense.

The Broader Lesson

TLS fingerprinting is a clean illustration of a pattern that runs through nearly all privacy engineering: encrypting the contents of a channel does not hide the channel’s metadata, and the metadata is often enough. The handshake has to be in the clear so two strangers can agree on a key. The shape of that handshake leaks the identity of the software making it. No amount of payload encryption closes that gap, because the gap exists before encryption begins.

The honest takeaway is not that TLS is broken (it isn’t) but that “the connection is encrypted” answers a narrower question than most people think. Knowing what your tools reveal in the clear, and choosing tools whose visible behavior is common rather than distinctive, is the part of the threat model that fingerprinting forces you to take seriously.

Originally published at havenmessenger.com

RAG with Postgres pgvector in 2026: the full TypeScript pipeline.

I spent a week evaluating dedicated vector databases before deciding to just use the Postgres instance I already had. The pgvector extension handles similarity search well enough for most production workloads, and it collapses three infrastructure components into one. This walkthrough covers everything from schema to answer: chunk your docs, embed them, store in pgvector, retrieve by cosine similarity, and wire the results into an LLM call.

TL;DR

Step | Tool | Why
Enable vector store | pgvector 0.8.x, HNSW index | Runs in your existing Postgres, no extra infra
Embed | text-embedding-3-small (1,536 dims) | $0.02 per million tokens, fast
Query | <=> cosine distance, top-k | Works with both OpenAI and Voyage models
Augment | Claude or GPT-4o with retrieved docs | Context window stuffed, hallucination rate drops

1. Why pgvector instead of a dedicated vector database

Pinecone and Weaviate are good products. If you need multi-tenant isolation, sub-millisecond p99 at 100M+ vectors, or native hybrid search with BM25, they earn their place. For most teams, those are future problems.

The cost calculus changes when you consider ops burden. A dedicated vector DB means a new billing line, a new set of credentials to rotate, a new failure mode to track, and a new SDK to keep current in your application. pgvector runs as a Postgres extension: one connection string, one backup strategy, one source of truth. At 10M documents with 1,536-dimensional embeddings, an HNSW index on a reasonably sized Postgres instance returns top-10 results in under 10ms. That covers the overwhelming share of RAG use cases.

pgvector 0.8.0 added iterative HNSW scans. That release made filtered similarity search practical without falling back to sequential scans every time a WHERE clause got specific. The 0.8.0 release was what tipped my team from “maybe later” to “ship it.”

2. Schema setup

Enable the extension once per database, then create your table.

-- enable pgvector (run once per database)
CREATE EXTENSION IF NOT EXISTS vector;

-- documents table
CREATE TABLE documents (
  id         BIGSERIAL PRIMARY KEY,
  source     TEXT NOT NULL,          -- filename, URL, or ID of source doc
  chunk_idx  INT NOT NULL,           -- chunk number within the source
  content    TEXT NOT NULL,          -- raw text of the chunk
  embedding  vector(1536) NOT NULL,  -- OpenAI text-embedding-3-small
  created_at TIMESTAMPTZ DEFAULT NOW()
);

Choosing between HNSW and IVFFlat

HNSW builds a navigable small-world graph. Queries scan the graph instead of comparing all rows. Build once, query immediately. The tradeoff is that the index takes more memory: roughly 8 bytes per dimension per row for a 1,536-dim column at default settings.

IVFFlat partitions the embedding space into centroid clusters. Faster to build, smaller memory footprint, but you must load rows before building the index or the centroid assignment is useless. If you are starting from zero rows, build HNSW.

-- HNSW index (recommended default)
-- m = connections per layer (default 16), higher = better recall at higher memory cost
-- ef_construction = candidate list during build (default 64), higher = better recall at slower build
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- IVFFlat alternative (only after loading rows)
-- lists = sqrt(row_count) is a good starting point for large tables
-- the operator class must match the query operator: vector_cosine_ops pairs with <=>
-- CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

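Use vector_cosine_ops with the <=> operator for cosine distance; it is the safe default and gives the same ranking whether or not the vectors are normalized. Use vector_l2_ops with <-> for raw Euclidean distance. Use vector_ip_ops with <#> for inner product when your embedding model returns normalized vectors (OpenAI and Voyage both do): the ranking then matches cosine and the query skips one normalization step. Note that <#> returns the negative inner product, so smaller values still mean closer matches. Whichever you pick, the index operator class and the query operator must agree, or the planner will not use the index.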
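The relationship between the three metrics on normalized vectors is simple arithmetic, and a few lines of TypeScript make it concrete. This is a standalone numeric sketch with hypothetical helper names; it does not touch the database.

```typescript
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length.
function normalize(v: number[]): number[] {
  const len = Math.sqrt(dot(v, v));
  return v.map((x) => x / len);
}

// Cosine distance as pgvector's <=> defines it: 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

// Euclidean distance, the quantity behind <->.
function l2Distance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

const a = normalize([3, 4, 0]);
const b = normalize([4, 3, 0]);

// On unit vectors: cosine distance = 1 - inner product,
// and squared L2 distance = 2 * cosine distance.
console.log(cosineDistance(a, b), 1 - dot(a, b), l2Distance(a, b) ** 2 / 2);
```

All three printed values agree up to floating-point error, which is why the three operators rank normalized embeddings identically.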

3. Ingest pipeline in TypeScript

The ingest function chunks a document, calls the embedding API, and bulk inserts rows. Use postgres (the npm package, not pg) for its tagged-template SQL and native array support.

import postgres from "postgres";
import OpenAI from "openai";

const sql = postgres(process.env.DATABASE_URL!);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });

const CHUNK_SIZE = 512;   // words with the naive chunker below; tokens once you swap in a tokenizer
const CHUNK_OVERLAP = 64; // same unit as CHUNK_SIZE, shared between adjacent chunks

function chunkText(text: string, size: number, overlap: number): string[] {
  // naive word-boundary chunker; swap for tiktoken in production
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  let start = 0;
  while (start < words.length) {
    const end = Math.min(start + size, words.length);
    chunks.push(words.slice(start, end).join(" "));
    start += size - overlap;
  }
  return chunks;
}

async function embedBatch(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  return response.data.map((d) => d.embedding);
}

export async function ingestDocument(source: string, text: string): Promise<void> {
  const chunks = chunkText(text, CHUNK_SIZE, CHUNK_OVERLAP);

  // embed in batches of 100 to keep each request small
  const BATCH = 100;
  for (let i = 0; i < chunks.length; i += BATCH) {
    const batch = chunks.slice(i, i + BATCH);
    const embeddings = await embedBatch(batch);

    const rows = batch.map((content, j) => ({
      source,
      chunk_idx: i + j,
      content,
      embedding: JSON.stringify(embeddings[j]),
    }));

    await sql`
      INSERT INTO documents (source, chunk_idx, content, embedding)
      SELECT
        r.source,
        r.chunk_idx::int,
        r.content,
        r.embedding::vector
      FROM jsonb_to_recordset(${JSON.stringify(rows)}::jsonb)
        AS r(source text, chunk_idx text, content text, embedding text)
    `;
  }

  console.log(`[ingest] ${source}: ${chunks.length} chunks stored`);
}

A note on chunk size: 512 words is a starting point. The right size depends on your source material. Legal documents with dense paragraphs do better at 256 words. Code files need at least 300 lines or you lose function context. The overlap prevents the embedding from missing a sentence that straddles a chunk boundary.
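If you experiment with different sizes and overlaps, a slightly more defensive chunker helps. The variant below is a sketch under the same word-based assumption as the ingest code; the name chunkWords is hypothetical. It rejects an overlap that is not smaller than the chunk size (which would otherwise never advance), ignores empty tokens, and stops once a chunk reaches the end of the text so the tail is not emitted twice.

```typescript
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("need size > 0 and 0 <= overlap < size");
  }
  // Split on whitespace and drop empty tokens from leading/trailing spaces.
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const stride = size - overlap; // how far each new chunk advances

  for (let start = 0; start < words.length; start += stride) {
    const end = Math.min(start + size, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break; // last chunk already covers the tail
  }
  return chunks;
}

console.log(chunkWords("a b c d e f g h i j", 4, 1));
```

With ten words, a size of 4, and an overlap of 1, each chunk starts three words after the previous one and shares exactly one word with it.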

4. Query pipeline in TypeScript

Embed the user’s question, run a top-k cosine similarity search, return the matching chunks.

export async function queryDocuments(
  question: string,
  topK = 5,
): Promise<Array<{ source: string; content: string; distance: number }>> {
  // embed the question with the same model used at ingest time
  const [embedding] = await embedBatch([question]);
  const embeddingStr = JSON.stringify(embedding);

  const rows = await sql<{ source: string; content: string; distance: number }[]>`
    SELECT
      source,
      content,
      (embedding <=> ${embeddingStr}::vector) AS distance
    FROM documents
    ORDER BY embedding <=> ${embeddingStr}::vector
    LIMIT ${topK}
  `;

  return rows;
}

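The <=> operator returns cosine distance (0 = identical, 2 = opposite). Lower numbers win. If you add metadata filters, put them in the WHERE clause of the same query. The iterative HNSW scan introduced in 0.8.0 can then keep walking the index until enough rows pass the filter. It is off by default, so enable it per session with SET hnsw.iterative_scan = strict_order (or relaxed_order).

// filtered query example: restrict the search to a single source
// (embeddingStr, filterSource, and topK are assumed to be in scope)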
const rows = await sql<{ source: string; content: string; distance: number }[]>`
  SELECT source, content, (embedding <=> ${embeddingStr}::vector) AS distance
  FROM documents
  WHERE source = ${filterSource}
  ORDER BY embedding <=> ${embeddingStr}::vector
  LIMIT ${topK}
`;
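Because the query returns a distance rather than a similarity, it is worth post-processing the rows before using them. The helper below is a hedged sketch: the Hit shape mirrors the rows returned by queryDocuments, while the function names and the cutoff value are assumptions to tune against your own data.

```typescript
interface Hit {
  source: string;
  content: string;
  distance: number; // cosine distance from <=>, between 0 and 2
}

// Cosine similarity is simply 1 minus cosine distance.
function toSimilarity(distance: number): number {
  return 1 - distance;
}

// Drop hits that are too far from the question, closest first.
function filterByDistance(hits: Hit[], maxDistance: number): Hit[] {
  return hits
    .filter((h) => h.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance);
}

const hits: Hit[] = [
  { source: "a.md", content: "close match", distance: 0.25 },
  { source: "b.md", content: "unrelated", distance: 0.9 },
  { source: "c.md", content: "closest match", distance: 0.1 },
];
console.log(filterByDistance(hits, 0.5).map((h) => h.source));
```

A cutoff like this keeps a top-k query from padding the context with weak matches when the corpus has little to say about the question.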

5. Wiring retrieved docs into an LLM call

Concatenate the retrieved chunks into a context block, then call your model of choice. Claude 3.5 Sonnet or GPT-4o both handle long contexts well. Keep the context block under 80,000 tokens for cost reasons.

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

export async function answerWithRAG(question: string): Promise<string> {
  const docs = await queryDocuments(question, 5);

  if (docs.length === 0) {
    return "No relevant documents found.";
  }

  const context = docs
    .map((d, i) => `[${i + 1}] (${d.source})\n${d.content}`)
    .join("\n\n---\n\n");

  const prompt = `You are a helpful assistant. Answer the question using only the provided context.
If the context does not contain the answer, say so.

Context:
${context}

Question: ${question}`;

  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-6-20250929",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });

  const block = response.content[0];
  return block.type === "text" ? block.text : "";
}

The “answer using only the provided context” instruction is load-bearing. Without it, the model mixes retrieval with parametric memory and you cannot tell which is which. If the answer comes from the context, citations work. If it comes from training data, they do not. Force the distinction at the prompt level.
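To respect a context budget such as the 80,000-token ceiling mentioned earlier, you can trim the retrieved chunks before building the prompt. This sketch assumes roughly four characters per token, which is a crude heuristic and not a real tokenizer; both function names are hypothetical.

```typescript
// Rough token estimate: assumes about 4 characters per token.
// Swap in a real tokenizer when accuracy matters.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep chunks in ranked order until the next one would exceed the budget.
function fitToBudget(chunks: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = estimateTokens(chunk);
    if (used + cost > maxTokens) break;
    kept.push(chunk);
    used += cost;
  }
  return kept;
}

console.log(fitToBudget(["aaaa", "bbbbbbbb", "cccc"], 3));
```

Stopping at the first chunk that does not fit, rather than skipping it and trying smaller ones, preserves the ranking order the retriever produced.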

One more thing worth noting: rerank before you send to the LLM. A fast cosine search returns the 5 closest chunks by vector distance, but distance does not always equal usefulness. A cross-encoder reranker (Cohere Rerank costs about $1 per 1,000 queries) takes your top-20 candidates and scores them for actual relevance before you trim to 5. The quality jump is noticeable. Skip the reranker while prototyping, add it before you hit production.
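The shape of that rerank step can be sketched without committing to a vendor. In the code below the scorer is a toy word-overlap function standing in for a cross-encoder; it is an assumption for illustration only, and a real reranker would be an asynchronous API call returning relevance scores.

```typescript
interface Candidate {
  content: string;
  distance: number; // vector distance from the first-stage search
}

type Scorer = (question: string, content: string) => number;

// Score every candidate, sort by score (highest first), keep the best few.
function rerank(question: string, candidates: Candidate[], score: Scorer, keep = 5): Candidate[] {
  return candidates
    .map((c) => ({ candidate: c, score: score(question, c.content) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, keep)
    .map((x) => x.candidate);
}

// Toy stand-in for a cross-encoder: counts question words found in the chunk.
function overlapScore(question: string, content: string): number {
  const words = (s: string) => s.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  const contentWords = words(content);
  return words(question).filter((w) => contentWords.indexOf(w) !== -1).length;
}

const candidates: Candidate[] = [
  { content: "pgvector supports HNSW indexes", distance: 0.1 },
  { content: "bananas are yellow", distance: 0.05 },
  { content: "HNSW indexes trade memory for recall in pgvector", distance: 0.2 },
];
const top = rerank("how do HNSW indexes work in pgvector", candidates, overlapScore, 2);
console.log(top.map((c) => c.content));
```

Note that the chunk with the smallest vector distance is dropped here because the scorer finds it irrelevant, which is the behavior a reranker is meant to add.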

6. Two gotchas that bite everyone

Chunk size drives recall more than index parameters

Most teams spend hours tuning HNSW m and ef_construction and see marginal gains. The actual lever is chunk size and overlap. A chunk that is too short loses context (the model cannot answer a cross-sentence question). A chunk that is too long pulls in noise, dilutes the embedding, and wastes context window in the LLM call. Run a quick eval: take 20 representative questions, retrieve top-5, then manually score whether the answer appeared in the returned chunks. Adjust chunk size in 100-word steps until recall tops 85%. Then tune the index.
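The eval loop described above boils down to one number. The sketch below computes it under a simplifying assumption: a case counts as a hit when an expected answer snippet appears verbatim in one of the top-k chunks, which is a stand-in for manual scoring. The EvalCase shape and function name are hypothetical.

```typescript
interface EvalCase {
  question: string;
  expected: string;    // snippet that a correct chunk should contain
  retrieved: string[]; // chunk contents returned for the question, best first
}

// Fraction of cases whose expected snippet shows up in the top-k chunks.
function recallAtK(cases: EvalCase[], k = 5): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) =>
    c.retrieved
      .slice(0, k)
      .some((chunk) => chunk.toLowerCase().indexOf(c.expected.toLowerCase()) !== -1),
  ).length;
  return hits / cases.length;
}

const evalCases: EvalCase[] = [
  {
    question: "Which index is the default?",
    expected: "HNSW",
    retrieved: ["Use an hnsw index by default.", "IVFFlat needs rows first."],
  },
  {
    question: "What does the overlap do?",
    expected: "chunk boundary",
    retrieved: ["Embeddings cost money.", "Postgres handles backups."],
  },
];
console.log(recallAtK(evalCases, 5));
```

Rerun the same cases after each chunk-size change so the numbers are comparable from one run to the next.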

Build the index after bulk loading, not before

HNSW indexing at insert time is slow. If you load 500,000 documents and the HNSW index exists, every INSERT pays the graph update cost. The fast path: load all rows with the index dropped, then build it once with CREATE INDEX. On a table of 500,000 rows with 1,536-dim embeddings, a cold HNSW build takes roughly 8 to 12 minutes on 4 vCPUs. That is far cheaper than the cumulative insert overhead.

-- drop the index before bulk load
DROP INDEX IF EXISTS documents_embedding_idx;

-- ... run your ingest pipeline ...

-- rebuild once after load
CREATE INDEX documents_embedding_idx
  ON documents USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

The bottom line

The full pipeline is about 120 lines of TypeScript and three SQL statements. pgvector 0.8.x is stable enough for production, HNSW is the right default index for most teams, and the two things that matter most for answer quality are chunk size and staying consistent between embed-at-ingest and embed-at-query time (same model, same preprocessing). Dedicated vector DBs are not wrong, they are just a layer you do not need until your row count passes 50M or your recall requirements get strict enough to warrant a tuning team.

What chunk size worked best for your use case? Drop it in the comments.

GDS K S · thegdsks.com · follow on X @thegdsks

Good retrieval beats a better model every time.