The Architecture Of Local-First Web Development

Last October, I was sitting in a hotel room in Lisbon, the night before I was supposed to demo a project management tool my team had spent four months building. The hotel Wi-Fi was doing that thing where it connects but nothing actually loads. And I watched our app, this thing I was genuinely proud of, render a blank screen with a spinner. Then a timeout error. Then nothing.

I pulled out my phone, tethered to cellular, and got a shaky connection. The app loaded, but every click was a two-second wait. Create a task? Spinner. Move a task between columns? Spinner. I sat there thinking: we built a front end in React, a back end in Node, a Postgres database, a Redis cache, a GraphQL API with six resolvers just for the task board. All that infrastructure, and the damn thing can’t show me my own data without a round-trip to a server 3,000 miles away.

That was the night I started seriously looking at local-first architecture. Not because I read a blog post or saw a tweet. Because I was embarrassed.

I want to be upfront about something: I spent the first year or so dismissing local-first as academic. I read the Ink & Switch “Local-First Software” paper when it came out in 2019 and thought, “Cool research, not practical for real apps.” I was wrong. The tooling in 2019 genuinely wasn’t ready. But I was also being lazy, defaulting to the architecture I already knew. The paper laid out seven ideals for software: fast, multi-device, offline, collaboration, longevity, privacy, user ownership. And I remember thinking those sounded like a wish list, not engineering requirements.

Seven years later, I’ve shipped three production apps using local-first patterns. I’ve also ripped local-first out of two projects where it was the wrong call. I have opinions. Some of them are probably wrong. But they’re earned.

So here’s what I actually think about building local-first web apps in 2026, written for developers who’ve been doing this long enough to be skeptical of silver bullets.

What “Local-First” Actually Means (And The Confusion That Won’t Die)

I need to clear something up because I keep having this conversation at meetups. Local-first is not offline-first. It’s not “add a service worker and call it a day.” It’s not a synonym for PWA. I’ve seen all of these conflated in conference talks, and it drives me a little crazy.

Offline-first means your app handles network loss gracefully, but the server is still the source of truth. When the network comes back, the server wins. Cache-first (service workers caching responses) is a performance optimization. You’re serving stale data faster, which is great, but you haven’t changed who owns the data. PWAs are a delivery mechanism: installable, cached, push notifications. None of these is a data architecture.

Local-first is a data architecture. Your user’s device holds the primary copy of their data. The app reads and writes to a local database. Renders instantly. Syncs with servers or other devices in the background. The server, when it exists, is a sync peer with some special authority (authentication, backup, access control). But it’s not the gatekeeper.

The Ink & Switch paper defined seven ideals, and I think they still hold up. But the one that matters most in practice, the one that changes how you build everything, is this:

The client is not a thin view requesting permission to show data. The client is a node in a distributed system with its own database.

That distinction sounds subtle. It isn’t. It changes your entire stack.

Be Honest Early: When You Should Not Do This

I’m putting this near the top because I’ve watched too many developers (including myself, once) get excited about a new architecture and shoehorn it into projects where it doesn’t belong. I wasted about six weeks trying to make a local-first approach work for an internal analytics dashboard at a previous job. My colleague Sarah finally pulled me aside and said, “The data is generated on the server. There’s nothing to replicate to the client. What are you doing?” She was right.

Local-first is a bad fit when your data is primarily server-generated. Analytics dashboards, social media feeds, search results: the server produces this data, so the client consuming it via API requests is completely fine.

It’s wrong for systems that need strong transactional consistency. Banking, payment processing, and inventory management. If two people try to buy the last item in stock, you need a single authoritative database making that decision with ACID guarantees. Eventual consistency will lose you money, or worse.

It’s overkill for simple CRUD apps with no offline or collaboration needs. If you’re building an internal admin panel used by five people in an office with good internet, adding a sync engine is over-engineering. And it’s physically impractical for massive datasets that won’t fit on client devices.

But here’s where it shines: note-taking, document editing, collaborative design tools, project management, field apps with unreliable connectivity, basically anything where data privacy is a selling point, as well as anything with real-time collaboration. In other words, it’s great for user-generated data that benefits from instant interaction and should survive the server going down.

One more thing I wish someone had told me earlier: you don’t have to go all-in. I’ve had the best results using local-first for specific features within otherwise traditional apps. Offline drafts in a blog editor. Real-time collaborative notes inside a project management tool that’s otherwise standard REST.

The “spectrum of local-first” is a real thing, and starting with one feature is how I’d recommend anyone begin.

Replicas, Not Requests

If you’ve used Git, you already understand the mental model.

SVN (remember SVN?) was centralized. One server. You check out files, make changes, and commit to the server. Server down? Can’t commit. Can’t even see history.

Git gave every developer a full clone. You commit locally, branch locally, and merge locally. Push and pull when you’re ready. The remote repository is important, but it’s not the only copy of the truth.

Local-first web development is Git for application data. Every client device holds a replica (full or partial) of the relevant data. Writes happen locally. Sync is push/pull in the background. Conflicts get resolved through defined merge strategies.

I remember the first time this clicked for me in practice. I was prototyping a task board, and I wrote a function to add a task. In our old architecture, it would be:

  1. POST to API.
  2. Wait for the response.
  3. If success, update the local state.
  4. If failure, show error toast and maybe roll back optimistic update.

In the local-first version, it was: write to local SQLite, done. The UI updated instantly because it was reading from the same local database. Sync happened whenever. No loading state, no error handling for the write itself, no optimistic update logic (because there’s nothing to be “optimistic” about; the local write is the state).

The implications ripple through everything. You don’t need React Query or SWR for data fetching, because you’re not fetching. You don’t need Redux or Zustand for server-derived state, because the local database is your state. Your routing doesn’t trigger API calls. Authentication works differently because the server isn’t checking permissions on every read.

If you’re the kind of person (like me) who thinks spatially, picture the two architectures side by side. In the traditional one, every user interaction is a round-trip: click, wait, render. In the local-first one, reads and writes hit the local database directly. The sync server is still there, but it does its work in the background, and the user never waits for it. That’s the fundamental shift.

But I’m getting ahead of myself. Before we can talk about sync and conflicts, we need to talk about where the data actually lives on the client.

Where Data Lives on the Client

Forget localStorage. It’s synchronous (blocks the main thread), caps at 5-10 MB, and only stores strings. It’s fine for a theme preference. It’s not a database.

IndexedDB is the workhorse that nobody loves. It’s in every browser, it’s asynchronous, it can handle hundreds of megabytes, and its API is absolutely miserable to work with. I’ve used it directly a grand total of once. Now I use it through abstractions or, more often, I don’t use it at all.

Because the real story in 2026 is SQLite running in the browser via WebAssembly.

I know that sounds like a party trick, but it’s not. SQLite compiled to WASM, persisted to the Origin Private File System (OPFS), gives you a real relational database in the browser. Full SQL queries. Transactions. Indexes. The works.

OPFS is the newer API that makes this practical. It gives web apps a sandboxed file system with high-performance synchronous access (in Web Workers), which is exactly what SQLite needs. Before OPFS, you could run SQLite in memory and manually persist to IndexedDB, which worked but was slow and fragile.
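If you’re curious what the VFS layer is actually doing with OPFS, here’s the primitive operation stripped down to its bones. This is a hand-written sketch, not wa-sqlite’s code, and it has to run inside a Web Worker because createSyncAccessHandle() isn’t available on the main thread:

```typescript
// Persist a byte snapshot to the Origin Private File System.
// A real database VFS reads/writes pages at offsets instead of
// rewriting the whole file, but the primitives are the same.
async function persistSnapshot(fileName: string, bytes: Uint8Array): Promise<void> {
  // Origin-private root directory; invisible to the user's file system.
  const root = await (globalThis as any).navigator.storage.getDirectory();
  const handle = await root.getFileHandle(fileName, { create: true });
  // Synchronous, high-throughput access: exactly what a database needs.
  const access = await handle.createSyncAccessHandle();
  try {
    access.truncate(0);
    access.write(bytes, { at: 0 });
    access.flush(); // force bytes to storage before reporting success
  } finally {
    access.close(); // the handle is exclusive; leaking it blocks other tabs
  }
}
```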

Here’s roughly what initialization looks like in a real project (I’m using wa-sqlite here, which is the library I’ve had the best luck with):

import SQLiteESMFactory from 'wa-sqlite/dist/wa-sqlite.mjs';
import * as SQLite from 'wa-sqlite';
import { OPFSCoopSyncVFS } from 'wa-sqlite/src/examples/OPFSCoopSyncVFS.js';

async function initDatabase() {
  const module = await SQLiteESMFactory();
  const sqlite3 = SQLite.Factory(module);

  const vfs = await OPFSCoopSyncVFS.create('pm-tool-db', module);
  sqlite3.vfs_register(vfs, true);

  const db = await sqlite3.open_v2('workspace.db');

  // HACK: wa-sqlite doesn't handle concurrent writes well on Safari,
  // so we serialize through a queue. See vlcn-io/wa-sqlite#247
  await sqlite3.exec(db, `PRAGMA journal_mode=WAL`);

  await sqlite3.exec(db, `CREATE TABLE IF NOT EXISTS tasks (
      id TEXT PRIMARY KEY,
      title TEXT NOT NULL,
      status TEXT DEFAULT 'backlog',
      assignee_id TEXT,
      project_id TEXT NOT NULL,
      position REAL DEFAULT 0,
      created_at TEXT DEFAULT (datetime('now')),
      updated_at TEXT DEFAULT (datetime('now'))
    )`);

  return db;
}

In production, I wrap all database access in a write queue that serializes mutations. I also log every failed write to Sentry with the full SQL statement (scrubbed of PII, obviously) because debugging database issues in a user’s browser is hell without that telemetry.
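That write queue is less exotic than it sounds: a promise chain that runs one mutation at a time. A minimal version (the production one also does the Sentry logging described above):

```typescript
// Serializes async mutations so no two writes hit the database
// concurrently. Each call chains onto the previous one's completion.
class WriteQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(mutation: () => Promise<T>): Promise<T> {
    // Chain regardless of whether the previous write failed;
    // one bad mutation shouldn't wedge the whole queue.
    const result = this.tail.then(mutation, mutation);
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Usage is just `queue.enqueue(() => db.execute(sql, params))` everywhere a mutation happens.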

A gotcha I wasted almost two days on: Safari’s OPFS implementation behaves differently from Chrome’s in subtle ways. Specifically, I hit a bug where createSyncAccessHandle() would silently fail in certain iframe contexts on Safari 18. There’s no error, no exception. It just doesn’t work. I ended up falling back to IndexedDB-backed persistence on Safari, which was slower but at least functioned. (I’m told Safari 19/26 fixes this, but I haven’t verified it yet.)

Quick comparison of the options I’ve actually used:

Storage                   | Good For                                       | Watch Out For
--------------------------|------------------------------------------------|-----------------------------------------
IndexedDB                 | Broad compatibility, moderate data             | Terrible DX, no SQL, verbose
OPFS + SQLite WASM        | Relational data, complex queries, serious apps | Safari quirks, ~400 KB bundle addition
PGlite (Postgres in WASM) | Full Postgres compatibility on client          | Newer, larger bundle, still maturing

I’ve also tried cr-sqlite, which adds CRDT column support directly to SQLite tables. Clever idea, but I found it too early-stage for production use when I evaluated it in late 2025. The merge semantics were sometimes surprising, and debugging CRDT state inside SQLite was painful. I’d revisit it later this year.

The Part That’s Actually Hard

Storing data locally is a solved problem. Syncing it reliably across devices and users is where you earn your gray hairs.

When multiple replicas can independently read and write, you need a mechanism to reconcile changes. There are basically four approaches: CRDTs, operational transformation, database replication, and event sourcing. I’ve used all but operational transformation, which mostly lives inside hosted editors like Google Docs, so I won’t cover it here.

CRDTs (Conflict-Free Replicated Data Types) are data structures designed so that concurrent edits can always be merged without conflicts, mathematically guaranteed. Yjs is the most popular implementation in JavaScript, and it’s genuinely excellent for real-time collaborative text editing. I used it to build a collaborative document editor at my last company, and the experience was mostly good, though I’ll get into the pain points in the conflict resolution section.

Here’s what setting up a shared Yjs document looks like in practice:

import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket';

const ydoc = new Y.Doc();

const provider = new WebsocketProvider(
  'wss://sync.our-app.dev',
  'workspace-a1b2c3d4',
  ydoc
);

const tasks = ydoc.getMap('tasks');

// Add a task
const task = new Y.Map();
task.set('title', 'Review Q3 roadmap draft');
task.set('completed', false);
task.set('assignee', 'maria');
// TODO: type this properly once yjs ships better TS types
// for nested maps. For now, the `as any` works fine.
tasks.set('f47ac10b-58cc-4372-a567-0e02b2c3d479', task as any);

tasks.observeDeep(() => {
  // Re-render UI. In practice, I debounce this to ~16ms
  // because observeDeep fires a LOT during active collaboration
  renderTaskList(tasks.toJSON());
});

Automerge is the other major CRDT library, backed by Rust and with a document-oriented model. I’ve used it less, but I know teams who swear by it. Loro is newer, Rust-based, and claims better performance. I haven’t shipped anything with Loro yet.

Database replication is the other big approach, and honestly, for most apps that don’t need Google Docs-style real-time text editing, I think it’s the better choice. The idea is straightforward: replicate rows between a server database (Postgres) and a client database (SQLite) with a sync engine managing the plumbing.

PowerSync does this well. It gives you one-way replication from Postgres to client SQLite with a write-back path for mutations. ElectricSQL is more ambitious, going for full active-active sync between Postgres and SQLite. I’ve used PowerSync in production and ElectricSQL in prototypes. PowerSync felt more stable when I evaluated them both in early 2026, but ElectricSQL’s approach is more powerful if they nail the execution.

Triplit takes a different angle entirely: it’s a full-stack database with sync built in, so you don’t think about “client DB” and “server DB” separately. I haven’t tried it beyond a weekend prototype, but the developer experience was surprisingly nice.

Event sourcing (syncing a log of mutations rather than the current state) is the approach LiveStore takes. I find it intellectually appealing and occasionally useful, but in practice, I’ve found that reconstructing state from an event log adds complexity that most apps don’t need. My controversial opinion: Event sourcing is over-recommended for application development. It’s great for audit logs and certain domains, but for a task board? Just sync the rows.

Not everyone will agree with that. I know event sourcing has passionate advocates, and I’ve been told I’m wrong about this at least twice at conferences. Maybe I just haven’t built the right app for it yet.
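For the record, here’s the shape of the thing I’m arguing against overusing: state derived by replaying a mutation log. The types are illustrative, not LiveStore’s actual API:

```typescript
// Event sourcing in miniature: sync the log, derive the state.
type TaskEvent =
  | { type: 'task_created'; id: string; title: string }
  | { type: 'task_moved'; id: string; status: string }
  | { type: 'task_deleted'; id: string };

interface TaskState { id: string; title: string; status: string }

// Rebuilding current state means replaying every event from the start
// (or from a snapshot). This replay machinery is the complexity cost.
function replay(events: TaskEvent[]): Map<string, TaskState> {
  const state = new Map<string, TaskState>();
  for (const e of events) {
    switch (e.type) {
      case 'task_created':
        state.set(e.id, { id: e.id, title: e.title, status: 'backlog' });
        break;
      case 'task_moved': {
        const task = state.get(e.id);
        if (task) task.status = e.status;
        break;
      }
      case 'task_deleted':
        state.delete(e.id);
        break;
    }
  }
  return state;
}
```

Compare that with “sync the rows”: the rows already are the state.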

Conflicts: The Thing Everyone’s Afraid Of

I used to think conflict resolution was a terrifying, unsolvable problem. After building three apps that handle it, I’d revise that to: it’s a manageable problem that requires you to think carefully about your specific data model, and most developers overthink it.

Conflicts happen when two replicas modify the same data without seeing each other’s changes. User A edits a task title on their phone while offline. User B edits the same title on their laptop. Both come back online. Now what?

My first attempt at handling this was embarrassingly naive:

// My first try. Don't do this.
function resolveConflict(local: any, remote: any) {
  // just... take the remote one? sure?
  return remote;
}

The problem is obvious: local changes get silently dropped. User A edits a title, syncs, and their edit vanishes. They don’t even know it happened.

What actually works for most cases is last-write-wins (LWW) at the field level, not the record level. If User A changes the title and User B changes the due date, you keep both changes because they touched different fields. You only have a real conflict when both modified the same field, and then you pick the later timestamp.

interface FieldValue {
  value: string | number | boolean;
  // ISO timestamp with enough precision to break most ties
  updatedAt: string;
  // Client ID as tiebreaker when timestamps match.
  // This happens more often than you'd think.
  clientId: string;
}

function pickWinner(a: FieldValue, b: FieldValue): FieldValue {
  const timeA = new Date(a.updatedAt).getTime();
  const timeB = new Date(b.updatedAt).getTime();
  if (timeA !== timeB) return timeA > timeB ? a : b;
  // Deterministic tiebreaker when timestamps match
  return a.clientId > b.clientId ? a : b;
}

// In practice, I apply this per-field across the whole record.
function mergeTask(local: Record<string, FieldValue>, remote: Record<string, FieldValue>) {
  const merged: Record<string, FieldValue> = {};
  const allKeys = new Set([...Object.keys(local), ...Object.keys(remote)]);
  for (const key of allKeys) {
    if (!local[key]) { merged[key] = remote[key]; continue; }
    if (!remote[key]) { merged[key] = local[key]; continue; }
    merged[key] = pickWinner(local[key], remote[key]);
  }
  return merged;
}

In our production app, this handles about 95% of conflicts without any user-visible issues. For the remaining cases (two people editing the same text field), LWW means one person’s edit silently wins. For a task title? Honestly, that’s usually fine. For a document body? No. That’s where CRDTs earn their keep.

But there’s a subtler problem I didn’t appreciate until I hit it: semantic conflicts. Data merges cleanly at the structural level, but the result is nonsensical. Two users, both offline, book the same 2 PM meeting slot with different meetings. Field-level merge accepts both writes because they’re writing to different records. No structural conflict. But you’ve got a double-booking, and your merge function has no idea that’s a problem.

Semantic conflicts require application-level validation, and that has to happen on the server during sync. Your sync engine merges the data structurally, but your server needs to check domain invariants before accepting the result. The approach I’ve landed on (after getting it wrong twice) is: validate on the server during the write-back phase, but flag violations rather than silently rejecting them.

Here’s what I mean. When the client pushes mutations to the server during sync, the server runs them through a constraint validation layer before applying them to Postgres:

interface SyncViolation {
  type: 'scheduling_conflict' | 'capacity_exceeded' | 'stale_assignment';
  recordId: string;
  description: string;
  // The conflicting records so the client can show context
  conflictingRecords: string[];
  // When was this violation detected
  detectedAt: string;
}

async function validateSyncBatch(
  mutations: SyncMutation[],
  serverDb: Database
): Promise<{ accepted: SyncMutation[]; violations: SyncViolation[] }> {
  const accepted: SyncMutation[] = [];
  const violations: SyncViolation[] = [];

  for (const mutation of mutations) {
    if (mutation.table === 'calendar_events') {
      // Check for double-booking
      const overlapping = await serverDb.query(
        `SELECT id, title FROM calendar_events
          WHERE room_id = ? AND id != ?
            AND start_time < ? AND end_time > ?`,
        [mutation.data.room_id, mutation.data.id,
         mutation.data.end_time, mutation.data.start_time]
      );

      if (overlapping.length > 0) {
        violations.push({
          type: 'scheduling_conflict',
          recordId: mutation.data.id,
          description: `Conflicts with "${overlapping[0].title}"`,
          conflictingRecords: overlapping.map(r => r.id),
          detectedAt: new Date().toISOString()
        });
        // Still accept the write, but flag it
        // The alternative is rejecting it, but then the user's
        // local state and server state diverge, and that's worse
        accepted.push(mutation);
        continue;
      }
    }
    accepted.push(mutation);
  }

  return { accepted, violations };
}

The key decision here — and I went back and forth on this — is that we accept the conflicting write and flag it, rather than rejecting it outright. If you reject it, the user’s local database has a record that the server refuses to acknowledge, and now you’re in a state divergence situation that’s genuinely hard to recover from. I tried the rejection approach first, and it led to ghost records on the client that users couldn’t delete because they didn’t exist on the server. Nightmare.

So instead, the server accepts the write, stores the violation, and syncs the violation back to the client. The client shows a non-blocking notification: “Your meeting ‘Q3 Planning’ conflicts with ‘Design Review’ in Room B at 2 PM. Tap to resolve.” The user taps, sees both meetings, and picks one to reschedule or cancel. The resolution is a normal write that syncs back.
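The client side of that flow is small. Here’s a hedged sketch; `notify` and the trimmed-down violation shape are stand-ins for our actual notification plumbing, not a library API:

```typescript
// Subset of the server-side SyncViolation shape, synced back to clients.
interface SyncViolation {
  type: string;
  recordId: string;
  description: string;
  conflictingRecords: string[];
}

function formatViolationMessage(v: SyncViolation): string {
  // Non-blocking copy, not a modal: the write already succeeded locally.
  return `${v.description}. Tap to resolve.`;
}

// Called when a sync cycle delivers new violations from the server.
function onViolationsSynced(
  violations: SyncViolation[],
  notify: (message: string, recordId: string) => void
): void {
  for (const v of violations) {
    notify(formatViolationMessage(v), v.recordId);
  }
  // Resolution is just another normal local write (reschedule/cancel),
  // which syncs back and clears the violation server-side.
}
```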

Is this perfect? No. There’s a window between when the violation is created and when the user resolves it, where both conflicting records exist. For meeting rooms, that’s tolerable. For something like inventory management where two people “buy” the last item, that window is unacceptable, and that’s exactly why I said earlier that local-first is wrong for systems requiring strong transactional consistency.

I’m still iterating on this pattern. The violation table grows if users ignore notifications (we expire them after 72 hours, which feels arbitrary). And deciding which invariants to validate on the server requires you to essentially maintain a parallel set of business rules outside your client-side application logic. It’s not elegant. But it works, and it’s the best approach I’ve found for the class of apps I’m building. If you’ve built something cleaner, I genuinely want to hear about it.

For CRDTs like Yjs, conflict resolution at the character level (for text) works remarkably well. Two people typing in the same paragraph will see both sets of characters appear in a sensible order. But CRDT merging of structured data (maps, arrays, nested objects) can produce results that surprise you. I once watched a Yjs-backed task list duplicate items after a merge because two users had reordered the same list offline, and the CRDT’s list merge semantics interleaved their orderings. Technically correct. Practically confusing. We ended up adding a post-merge de-duplication step, which felt like a hack but solved the problem.

When should you surface conflicts to the user, Git-style? In my experience, almost never for typical app data. Users don’t want to resolve merge conflicts. They want the app to figure it out. The exception is high-stakes content: legal documents, medical records, anything where silently dropping an edit could cause real harm.

The Tools Right Now

I’m going to give you my honest read on the tools available as of mid-2026, with the caveat that this space is moving fast enough that some of this might be outdated by the time you read it.

Yjs is the most mature CRDT library. Production-ready, huge community, integrates with most collaborative editors (TipTap, BlockNote, Lexical). If you need real-time collaborative editing, start here.

Automerge is solid, Rust-backed, and takes a more document-oriented approach than Yjs. I’ve seen it used well in apps where the data model fits a document metaphor. Fewer integrations than Yjs, but the core is well-engineered.

PowerSync is what I’d recommend for teams that have an existing Postgres back-end and want to add offline support. It’s production-ready, the docs are good, and the mental model (Postgres syncs to client SQLite, client writes go through a defined upload path) is easy to reason about. In our app, initial sync for a workspace with around 5,000 tasks takes about 1.2 seconds on a decent connection and about 3.5 seconds on a throttled 3G simulation. That was acceptable for us.

ElectricSQL is going for something more ambitious: true active-active replication between Postgres and SQLite, with “shapes” defining what data syncs to which client. I want this to succeed because the developer experience in prototypes was excellent. But when I evaluated it for production in February 2026, I hit enough rough edges (particularly around shape management and reconnection behavior) that I went with PowerSync instead. I plan to revisit it.

Triplit impressed me in a weekend prototype. Full-stack database with sync built in, nice TypeScript API. I haven’t stress-tested it with real production load, and I’d want to before committing.

Zero (from Rocicorp, the Replicache people) is interesting because it takes a query-based approach to sync, which is different from the row-replication model. Replicache was sunset in favor of Zero, which tells you something about how fast approaches are evolving in this space. Worth watching, but I wouldn’t build on it yet for a production app.

TinyBase is a lightweight reactive store that’s great for smaller apps or prototyping. I used it for a personal side project (a reading tracker) and liked it a lot. Not sure I’d use it for a team-scale product.

PGlite (Postgres compiled to WASM) is wild. Same SQL dialect on client and server. Combined with ElectricSQL, you could theoretically run identical queries everywhere. I think this is where things are heading long-term, but PGlite’s bundle size and memory footprint are still concerns for mobile browsers.

One thing the Replicache sunset taught me: don’t bet your architecture on a single tool from a small company without a fallback plan. I keep my sync layer abstracted enough that I could swap engines in a few weeks, not months. I know that sounds like premature abstraction, but in a space this young, I think it’s just prudence.
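Concretely, that abstraction is nothing fancy: a thin interface the rest of the app talks to, with one adapter file per engine. This is my own seam, not any library’s API:

```typescript
// The only surface the rest of the app is allowed to touch.
// PowerSync, ElectricSQL, or a hand-rolled sync layer all fit behind it.
interface SyncEngine {
  connect(token: string): Promise<void>;
  disconnect(): Promise<void>;
  // Reactive reads: callback fires whenever matching local rows change.
  // Returns an unsubscribe function.
  watch<T>(sql: string, params: unknown[], onRows: (rows: T[]) => void): () => void;
  // Local write that the engine queues for upload.
  execute(sql: string, params?: unknown[]): Promise<void>;
}

// A recording fake is enough for tests, and proves the seam works.
class RecordingEngine implements SyncEngine {
  executed: { sql: string; params?: unknown[] }[] = [];
  async connect(_token: string): Promise<void> {}
  async disconnect(): Promise<void> {}
  watch<T>(_sql: string, _params: unknown[], _onRows: (rows: T[]) => void): () => void {
    return () => {}; // unsubscribe no-op
  }
  async execute(sql: string, params?: unknown[]): Promise<void> {
    this.executed.push({ sql, params });
  }
}
```

Swapping engines then means writing one new adapter, not touching every component.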

Building A Real App: Architecture, Auth, And Migrations

I want to walk through how I actually structure a local-first app in practice, because the layer diagrams you see in blog posts rarely match what the code looks like.

My current stack for a collaborative project management tool looks like this:

  • UI: React components that never call fetch() for data reads.
  • Query layer: useLiveQuery hooks that subscribe to the local SQLite database and re-render automatically when data changes.
  • Local database: SQLite via wa-sqlite, persisted to OPFS.
  • Mutation layer: Plain INSERT/UPDATE/DELETE statements against local SQLite.
  • Sync: PowerSync managing replication between local SQLite and our Postgres back-end.
  • Server: Postgres, a Node.js auth service, and a small sync validation layer.

The component code ends up looking almost absurdly simple compared to what I used to write:

import { useLiveQuery } from '@powersync/react';
import { db } from '../lib/database';

function TaskBoard({ projectId }: { projectId: string }) {
  const tasks = useLiveQuery(
    `SELECT * FROM tasks WHERE project_id = ? AND archived = 0 ORDER BY position`,
    [projectId]
  );

  async function addTask(title: string) {
    await db.execute(
      `INSERT INTO tasks (id, title, project_id, position, created_at)
       VALUES (?, ?, ?, ?, datetime('now'))`,
      [crypto.randomUUID(), title, projectId, tasks.length]
    );
    // That's it. useLiveQuery picks up the change automatically.
    // No invalidation, no refetch, no loading state.
  }

  // No isLoading check. Data is local. It's always there after the first sync.
  return (
    <div>
      {tasks.map(task => <TaskCard key={task.id} task={task} />)}
      <NewTaskInput onSubmit={addTask} />
    </div>
  );
}

Compare that to the React Query + REST equivalent, which would be at least twice the code and include loading states, error states, optimistic update logic with rollback, and cache invalidation. I don’t miss it.

Auth In A Local-First World

Authentication works roughly the same as traditional apps: JWT tokens, OAuth flows, and session management. The token authenticates the sync connection rather than every individual request. Offline access works because the data is already local. The user was authenticated when the data was originally synced.
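In code, that looks like a backend connector the sync client calls whenever it needs a token. The shape below follows PowerSync’s connector pattern from memory; treat the exact names as illustrative and check the current docs:

```typescript
interface SyncCredentials {
  endpoint: string;
  token: string; // JWT that authenticates the sync connection
}

class Connector {
  // getSessionToken is your auth layer: OAuth refresh, session lookup, etc.
  constructor(private getSessionToken: () => Promise<string | null>) {}

  // Called by the sync client on (re)connect or when the token nears
  // expiry. No token means no sync, but local reads still work:
  // the data was synced while the user was authenticated.
  async fetchCredentials(): Promise<SyncCredentials | null> {
    const token = await this.getSessionToken();
    if (!token) return null;
    return { endpoint: 'https://sync.example.dev', token };
  }
}
```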

Authorization is trickier, and I think most local-first articles under-explain this. You cannot sync your entire database to every client and rely on client-side code to hide unauthorized data. Someone will open DevTools, find the local SQLite file, and see everything. The client is not a trust boundary.

You enforce authorization at the sync layer. PowerSync has “sync rules” that define which rows go to which clients. ElectricSQL has “shapes.” Either way, the server only sends data that the user is authorized to see. When the client sends writes back, the server validates them against authorization rules before applying them to Postgres. If a user tries to modify something they shouldn’t, the server rejects it during sync.

I also want to mention end-to-end encryption (E2EE), because it pairs naturally with local-first. Since data lives on the client, you can encrypt it before sync. The server stores and relays encrypted blobs it can’t read. Apps like Anytype do this. We haven’t implemented E2EE in our current app, but it’s on the roadmap for when we handle more sensitive data.

Schema Migrations On A Thousand Devices

This one caught me off guard the first time. On the server, you run a migration against one database you control. On the client, every user has their own database that might be running any version of your schema, depending on when they last opened the app.

I use a simple migration runner that checks a version number at app startup:

const MIGRATIONS = [
  {
    version: 1,
    sql: `CREATE TABLE IF NOT EXISTS tasks (
        id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        status TEXT DEFAULT 'backlog',
        project_id TEXT NOT NULL,
        created_at TEXT DEFAULT (datetime('now'))
      );`
  },
  {
    version: 2,
    // Added priority and due_date in sprint 4
    sql: `ALTER TABLE tasks ADD COLUMN priority INTEGER DEFAULT 0;
      ALTER TABLE tasks ADD COLUMN due_date TEXT;`
  },
  {
    version: 3,
    // Denormalized assignee name for offline display.
    // Yes, I know this is a trade-off. The JOIN was killing
    // performance on low-end Android devices.
    sql: `ALTER TABLE tasks ADD COLUMN assignee_name TEXT DEFAULT '';`
  }
];

async function runMigrations(db: Database) {
  await db.execute(`CREATE TABLE IF NOT EXISTS _schema_version (version INTEGER)`);

  const rows = await db.execute('SELECT version FROM _schema_version');
  const currentVersion = rows.length > 0 ? rows[0].version : 0;

  for (const migration of MIGRATIONS) {
    if (migration.version > currentVersion) {
      console.log(`Migrating local DB to v${migration.version}`);
      await db.execute('BEGIN');
      try {
        await db.execute(migration.sql);
        await db.execute(
          'INSERT OR REPLACE INTO _schema_version (rowid, version) VALUES (1, ?)',
          [migration.version]
        );
        await db.execute('COMMIT');
      } catch (err) {
        await db.execute('ROLLBACK');
        // In production, this fires a Sentry alert with the
        // migration version and error details
        throw err;
      }
    }
  }
}

Design your migrations to be additive. New columns with defaults. New tables. Don’t rename or drop columns unless you absolutely must, because users running old app versions will still be syncing data, and your server needs to handle the mismatch. I learned this the hard way when I dropped a column that an older client was still writing to, which caused silent sync failures for about 200 users over a weekend. Not fun.

If I Were Starting A New Project Today

I get asked this a lot, so here’s my current answer. It changes every six months or so.

For a collaborative app with real-time features and offline support, I’d start with: React on the front end, PowerSync for sync, SQLite via wa-sqlite on the client (persisted to OPFS with IndexedDB fallback for Safari), and Supabase (which gives me Postgres, auth, and row-level security out of the box). I’d use Yjs only if I needed rich text collaboration, and I’d avoid it if I didn’t, because CRDTs add meaningful complexity to your data model.

For a simpler app where I mostly need offline support and instant reads but collaboration is secondary, I might skip the sync engine entirely and just use a local SQLite database with a custom sync layer that pushes/pulls from a REST API. I know that sounds like reinventing the wheel, but for simple cases, a custom sync that you fully understand is better than a general-purpose sync engine that adds concepts you don’t need.
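To make “custom sync layer” concrete, here is a minimal sketch of the shape I mean: an outbox of unsynced local writes plus a server cursor. The `Op` type, transport endpoints, and cursor semantics here are hypothetical; a real version also needs retry, backoff, and conflict handling.

```typescript
// Minimal hand-rolled sync layer: push an outbox of local writes,
// then pull everything newer than our last-seen server cursor.
type Op = { id: string; table: string; payload: unknown };

interface Transport {
  // e.g. POST /sync/push -- server applies ops, returns its cursor
  push(ops: Op[]): Promise<string>;
  // e.g. GET /sync/pull?since=cursor -- remote ops plus a new cursor
  pull(since: string): Promise<{ ops: Op[]; cursor: string }>;
}

class SimpleSync {
  private outbox: Op[] = [];  // local writes not yet acknowledged
  private cursor = "0";       // last server state we have seen

  constructor(private transport: Transport) {}

  // Called after every local write. The write already hit the local
  // DB; we only record it here for later upload.
  enqueue(op: Op) {
    this.outbox.push(op);
  }

  // One sync round: push our outbox, then pull what we missed.
  async sync(): Promise<Op[]> {
    if (this.outbox.length > 0) {
      this.cursor = await this.transport.push(this.outbox);
      this.outbox = [];
    }
    const { ops, cursor } = await this.transport.pull(this.cursor);
    this.cursor = cursor;
    return ops; // caller applies these to the local DB
  }
}
```

The point is that you can read every line of this and know exactly what happens when sync fails halfway, which is the whole argument for rolling your own in simple cases.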

I would not currently use ElectricSQL or Zero for production, not because they’re bad, but because I want another 6-12 months of maturity before I’d trust them for something I’m on-call for. I’ve been burned before by building on early-stage infrastructure (I was an early Meteor adopter, if that tells you anything) and I’m more cautious now about where I accept novelty risk.

Performance: What’s Actually Fast And What Hurts

Reads are instant. That’s not marketing. Querying a local SQLite database for a list of 500 tasks takes under two milliseconds on my M2 MacBook and about eight milliseconds on a mid-range Android phone. No network. No spinner. No loading state.

Writes are instant, too. INSERT INTO tasks runs locally, the UI updates reactively, and sync happens whenever. Users perceive writes as instantaneous because they are.

Initial sync is where you pay the cost. Bootstrapping the local replica on first load (or on a new device) means downloading potentially megabytes of data. In our app, a workspace with 5,000 tasks, 200 projects, and 50 users takes about 1.2 seconds on broadband and four to five seconds on a slow mobile connection. We mitigate this with partial sync (only sync the user’s active projects) and by showing a one-time “Setting up your workspace” screen during the first sync. After that initial sync, incremental updates are tiny.

Bundle size is a real concern. SQLite compiled to WASM adds roughly 400KB gzipped to your JavaScript bundle. That’s not trivial, especially if you care about Time to Interactive on mobile. I lazy-load the database module with dynamic import() so it doesn’t block the initial render.
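The loader itself is just a memoized async initializer. Sketched here with the import factory injected so the pattern is visible on its own; at the real call site you would pass `() => import("./db")`, where "./db" is a placeholder for whatever module wraps your wa-sqlite setup.

```typescript
// Memoized lazy initializer: the first call kicks off the (expensive)
// dynamic import; every later call reuses the same in-flight promise,
// so the WASM module loads exactly once and never blocks first paint.
function makeLazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (!cached) cached = load();
    return cached;
  };
}

// At the call site (module path hypothetical):
// const getDb = makeLazy(() => import("./db").then((m) => m.openDatabase()));
```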

Memory is the other gotcha. SQLite WASM runs in memory, and on mobile browsers with aggressive memory limits, a large database can cause tab crashes. I haven’t found a great solution for this beyond keeping the synced dataset small through partial sync and being aggressive about pruning old data.

Note: Speaking of memory issues, I’ve been reading Designing Data-Intensive Applications by Martin Kleppmann for the third time. Every re-read, I catch something new. If you haven’t read it and you’re thinking about distributed data, just stop and read it first.

Testing This Stuff

I’ll keep this brief because the honest answer is that testing local-first apps is harder than testing traditional apps, and the tooling isn’t great yet.

What works for me: unit tests for merge logic (these are pure functions, easy to test), integration tests that spin up two client instances in memory and verify they converge after concurrent edits, and Playwright E2E tests that use context.setOffline(true) to simulate offline/online transitions.

What I haven’t figured out well: reproducing bugs that only happen during conflict resolution with specific timing. When a user reports that a task “lost its description,” I often can’t reproduce it because I don’t know exactly what sequence of offline edits and sync events led to the conflict. I’ve started logging sync events in more detail (what was sent, what was received, what conflicts were detected, how they were resolved) and shipping those logs to our observability stack. It helps, but it’s not as clean as I’d like.

Property-based testing with something like fast-check is genuinely useful for CRDT logic. Generate random operation sequences, apply them in random orders, and assert convergence. I wish I’d started doing this earlier.
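Even without fast-check, the core idea fits in a few lines: generate random operations, apply them in shuffled orders, and assert the merged state converges. Here it is hand-rolled against a toy last-writer-wins register (timestamp plus actor-id tiebreak); fast-check adds proper generators and shrinking on top of the same property.

```typescript
// Convergence property for a tiny last-writer-wins register.
type LwwOp = { value: string; ts: number; actor: string };

// Deterministic merge: later timestamp wins; actor id breaks ties,
// so the order of comparison never matters.
function merge(a: LwwOp, b: LwwOp): LwwOp {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.actor > b.actor ? a : b;
}

function applyAll(ops: LwwOp[]): LwwOp {
  return ops.reduce(merge);
}

// Fisher-Yates shuffle to simulate ops arriving in arbitrary order.
function shuffled<T>(xs: T[]): T[] {
  const out = [...xs];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Property: every permutation of the same ops yields the same state.
function checkConvergence(runs = 100): boolean {
  for (let r = 0; r < runs; r++) {
    const ops: LwwOp[] = Array.from({ length: 5 }, (_, i) => ({
      value: `v${i}`,
      ts: Math.floor(Math.random() * 3), // timestamp collisions on purpose
      actor: String(i),
    }));
    const first = applyAll(shuffled(ops)).value;
    for (let k = 0; k < 10; k++) {
      if (applyAll(shuffled(ops)).value !== first) return false;
    }
  }
  return true;
}
```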

What I’m Watching, What Worries Me

I’m excited about where this is going. PGlite (full Postgres in the browser) feels like a glimpse of a future where the client/server data layer distinction just dissolves. You write SQL, it runs everywhere, sync is a runtime concern rather than an architectural decision. We’re not there yet, but you can see it from here.

I’m also watching the convergence of local-first and AI: running models locally, keeping data on-device, using cloud AI only with explicit consent, and encrypting synced data. The privacy implications are compelling, and I think “your data never leaves your device” will become a real product differentiator as AI eats more of the software experience.

What worries me is fragmentation. Every sync engine uses its own protocol. There’s no standard. If ElectricSQL shuts down (it won’t, probably, but if), migrating to PowerSync isn’t trivial. I abstract my sync layer partly for this reason, but it still makes me nervous.

The web has standards for nearly everything. We don’t have one for sync, and I don’t see one emerging soon.

I’m also worried about the complexity budget. Local-first adds real architectural complexity: sync engines, conflict resolution, client-side migrations, partial replication, and auth at the sync boundary. For a team of experienced developers building the right kind of app, that complexity pays for itself many times over. For a team that just needs a CRUD app, it’s a trap.

I keep coming back to something a developer named Kevin said to me at a local-first meetup in Berlin last year:

“The best architecture is the one your team can debug at 2 AM.”

He’s right. If local-first makes your app faster, more reliable, and better for users, and your team understands how the sync works, go for it. If you’re adding it because it sounds cool and you don’t fully understand the failure modes yet, build a prototype first. Learn where it breaks. Then decide.

I’m building my fourth local-first app right now: a collaborative planning tool for small teams, with offline support and optional E2E encryption. It’s the most ambitious thing I’ve attempted with this architecture. I’ll write about how it goes.

If you’re starting out, pick one feature in your current app that would benefit from instant local reads and offline writes. Add a local SQLite database. Wire up reactive queries. See how it feels. I think you’ll have the same reaction I did: oh, this is how it should have always worked.

Further Reading

  • “Local-First Software” (Ink & Switch): This is still the best starting point.
  • “CRDTs: The hard parts” (Martin Kleppmann, video): Martin’s talks on CRDTs are excellent.
  • The localfirstweb.dev community site: A good directory of tools.
  • PowerSync Documentation
  • ElectricSQL Documentation
  • Yjs Documentation
  • Automerge Documentation

Your AI Agent Knows What to Do, But Does It Know How?

The missing piece in most LLM applications, and how AgentSkills fix it. We’ve gotten pretty good at telling AI agents who they are.

“You are an expert software engineer.” “You are a seasoned marketing strategist.” We hand them a persona, dump in some context, maybe paste in a few examples, and then we hit send and hope for the best.

And for simple tasks? That works fine.

But the moment you ask an agent to do something that involves multiple steps, decisions, and potential failure points, things start to fall apart in ways that are hard to predict and even harder to debug.

The agent sounds confident. It just doesn’t behave consistently.

Here’s why and what to do about it.

The Gap Nobody Talks About

There’s a meaningful difference between knowing what needs to be done and knowing how to do it reliably.

A new employee on their first day might understand the goal perfectly (“onboard this customer”) but still flounder without a clear process. Do they send the welcome email first or set up the account? What if the system throws an error? Who do they escalate to?

Without a procedure, they improvise. Sometimes that works. Often it doesn’t.

LLM agents have the exact same problem.

You can give an agent all the context in the world about what it’s supposed to accomplish, and it’ll still invent its own process every single time it runs. Skipping steps. Hallucinating validations. Silently glossing over failures.

This is the gap, and it’s where most LLM applications quietly break down.

Enter AgentSkills (and Why They’re a Big Deal)

AgentSkills, also called Procedure Skills, are exactly what they sound like: explicit, step-by-step instructions that teach an agent how to execute a task, not just what the task is.

Think of it less like a prompt and more like a standard operating procedure. A playbook. A binder on the shelf.

Industry leaders like Anthropic and Microsoft have both converged on this idea and formalized it around a portable format called SKILL.md. That’s not a coincidence; it signals that the field is maturing from “prompt engineering” toward something more rigorous: procedure engineering.

What a Skill Actually Looks Like

A skill isn’t a single prompt tucked inside a system message. It’s a structured, self-contained unit of procedural knowledge: a directory that bundles everything an agent needs to execute a specific type of task.

Here’s how it breaks down:

SKILL.md is the core: the instruction manual. It contains YAML frontmatter that lets the agent automatically discover and select the right skill for the job, plus detailed step-by-step execution instructions.

scripts/ holds small, single-purpose automation scripts (Python, Bash, Node.js) for the steps that LLMs consistently get wrong when left to their own devices. Repetitive operations, file handling, API calls: these belong in code, not in natural-language instructions.

resources/ contains domain-specific knowledge: company standards, data schemas, regulatory rules, anything the agent needs to reference but shouldn’t be expected to memorize.

assets/ stores output templates: JSON schemas, document layouts, checklists, so the agent produces consistent, structured results every time.

Put it all together and you get a self-contained playbook: instructions, tools, references, and templates in one place.
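To make the shape concrete, here is a small, entirely hypothetical SKILL.md. The frontmatter fields and step wording are illustrative, not a definitive schema:

```markdown
---
name: invoice-extraction
description: Extract line items from PDF invoices and return validated JSON.
---

# Invoice Extraction

1. Run `scripts/pdf_to_text.py <file>` to get raw text. If the script
   exits non-zero, stop and report the error. Do not guess the contents.
2. Map each line item onto the schema in `assets/invoice.schema.json`.
3. Validate the output against the schema before returning it.
4. If validation fails, list the failing fields and ask the user for
   clarification instead of silently fixing them.
```

Notice that steps 1 and 4 spell out failure behavior explicitly; that is where agents improvise worst when left unguided.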

The Three Layers Most Teams Confuse

Before you can appreciate why skills matter, it helps to get clear on what they’re not:

  • A prompt tells the agent what to produce.
  • A tool gives the agent an action it can take: an API call, a file write, a database query.
  • A skill tells the agent how to run a workflow: which steps, in what order, with what validation.

Most teams have prompts. Many now have tools. Very few have skills.

A skill is where workflow intelligence lives. It’s the layer that answers the questions nobody bothers to write down: What comes first? What needs to be validated before moving on? What happens if this step fails?

Why Embedding All of This in a System Prompt Fails

The intuitive response to all of this is: “Can’t I just put the procedure in the system prompt?”

You can. And for a single, small workflow it might work okay. But it breaks down fast for a few predictable reasons.

Fragility. Large, instruction-heavy prompts are brittle. One small tweak to the wording can cascade into completely different agent behavior. There’s no modularity, no separation of concerns.

Token waste. Every time the agent runs, it pays the full token cost of every procedure, even the ones that are completely irrelevant to the current task. At scale, this adds up fast.

Inconsistency. Without explicit validation steps (“check whether the file exists before editing it”), agents will invent shortcuts. They’ll confidently skip steps and never tell you they did it.

The result is the thing that makes AI in production so frustrating: agents that sound certain and behave unpredictably.

The Idea That Changes Everything: Progressive Disclosure

Here’s the mental model that ties this all together, and it’s dead simple.

Imagine your new employee’s first day. You have two options:

Bad approach: Pile every binder, all 50 of them, on their desk. Tell them to read all of it before they start. By 11am they’re exhausted, overwhelmed, and can’t remember a thing.

Good approach: Put the binders on a shelf with clear labels. They glance at the labels, grab the one they need, read it, and do the job. Tomorrow, they grab a different one.

That’s Progressive Disclosure.

In practice, it works in two phases:

Discovery Phase: The agent loads only skill names and short descriptions. A table of contents for procedural knowledge. Minimal tokens, maximum orientation.

Activation Phase: When a user request matches a skill’s description, the agent loads the full SKILL.md and supporting assets into active memory. Only what’s needed, only when it’s needed.

The payoff is real: fewer hallucinations, lower token costs, better decisions when many skills exist simultaneously.
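In code, the two phases reduce to a small dispatch step. This sketch is illustrative only: the catalog scan is discovery, the single `load` call is activation, and the naive keyword match stands in for the model itself picking a skill from the catalog.

```typescript
// Progressive disclosure in miniature. Discovery reads only the skill
// catalog (names + descriptions: cheap in tokens); activation loads
// the one full SKILL.md body that is actually needed.
type SkillMeta = { name: string; description: string };

interface SkillStore {
  list(): SkillMeta[];                  // discovery: metadata only
  load(name: string): Promise<string>;  // activation: full skill body
}

async function activate(store: SkillStore, request: string): Promise<string | null> {
  // Naive keyword overlap stands in for "the model picks a skill".
  const words = request.toLowerCase().split(/\s+/).filter((w) => w.length > 3);
  const hit = store.list().find((s) =>
    words.some((w) => s.description.toLowerCase().includes(w))
  );
  if (!hit) return null;       // nothing matched: no extra tokens spent
  return store.load(hit.name); // load exactly one playbook
}
```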

How to Design Skills That Actually Work

If you’re going to build skills, these principles are worth internalizing from day one:

Write in third-person imperative. “Extract the text.” Not “You should try to extract the text.” Precision matters: ambiguous instructions produce ambiguous behavior.

Define failure states explicitly. What should the agent do when a script errors? When a file is missing? When validation fails? If you don’t specify, the agent will improvise, and you won’t like the improvisation.

Keep skills small and composable. A skill called “Marketing” is a red flag. A skill called “Ad Copy Generation” is useful. A skill called “SEO Analysis” is useful. Small, focused skills compose into larger workflows. Monolithic skills just become another fragile mega-prompt in disguise.

When Does This Actually Matter?

Not every situation calls for this level of structure. If you have one skill and it’s always needed, just hand it to the agent upfront. Progressive disclosure doesn’t help when there’s nothing to disclose progressively.

But as your agent grows (more tasks, more workflows, more edge cases), the calculus changes:

  • 10 skills, one needed at a time? Huge savings. Show only what’s needed.
  • 50 skills? Progressive disclosure becomes essential. Otherwise the agent drowns.
  • Complex multi-step workflows? Explicit failure states and validation steps stop being nice-to-haves and become the difference between an agent that works and one that confidently fails.

The Shift Worth Making

AgentSkills represent a genuine change in how we think about building with LLMs.

We’re moving from prompt engineering, which is ultimately about describing what we want, to procedure engineering, which is about encoding how to reliably do it.

From probabilistic answers to deterministic execution.

From agents that talk about the work to agents that actually do it.

The tools and the personas are important. But without skills, you’ve hired a brilliant employee who has no idea how your company actually operates. Give them the binders. Label them clearly. Put them on the shelf.

That’s the whole idea.

The one takeaway: MCP gives the LLM the tools. Skills tell the LLM when to use them. Progressive disclosure means “show only what’s needed, when it’s needed.”

Thanks
Sreeni Ramadorai

[FabCon Atlanta 2026 Report] My Take on Fabric IQ Ontology

I attended FabCon Atlanta 2026.

I also created a few short videos that show the atmosphere of the venue, so feel free to check them out first.

FabCon Atlanta 2026, Day 1–Day 3 morning: workshops and keynote

FabCon Atlanta 2026, Day 3 noon–Day 5: CoreNote session and power hour

In this article, based on what I saw and heard at FabCon, I would like to focus especially on Ontology within Fabric IQ and share how I think we should understand it at this point in time.

Fabric IQ is described as a workload that organizes data in OneLake using business language, enabling analytics and AI agents to use that data with consistent meaning.

The Fabric IQ workload includes semantic models and Data Agents, and Ontology is one part of it.

I think many people may currently understand “Fabric IQ” as almost the same thing as “Ontology.”

That is not completely wrong. However, Fabric IQ is a broader term, so in this article I will mainly use the word “Ontology” to avoid confusion.

What is Fabric IQ (preview)?

The Atmosphere Around Fabric IQ at FabCon

At FabCon, I felt that everyone was highly interested in Fabric IQ.

At the same time, some of the questions were very basic, such as “What is IQ?”

In other words, my honest impression was that Fabric IQ is attracting a lot of attention, but even in the United States, understanding of it has not yet become widespread.

I also attended several IQ-related sessions. Based on the sessions I joined, I cannot say that I clearly saw exactly which real-world projects should use it and how.

Of course, there were Ontology demos, and there were discussions about how AI will be able to understand business meaning more easily and how the semantic layer will become more important. Officially, Ontology is also described as a way to represent a business in a machine-readable form through entities, properties, relationships, and rules.

However, to be honest, my current impression is that the concept itself is very attractive, but common implementation patterns are not yet widely understood.

My Conclusion First: I Would Still Take a Wait-and-See Approach for Production Use

My conclusion is that, at this point, I would still take a wait-and-see approach before placing Ontology at the center of a production environment.

The reason is simple.

First, it is still officially in preview.

Second, when it comes to improving the accuracy of Data Agents by giving them business context, I feel that many use cases can already be covered quite well by using semantic models.

A Common Misunderstanding

Ontology allows you to create entities as business objects and define relationships using natural language to represent business meaning.

On the other hand, based on the current specification, you cannot simply write natural-language descriptions for tables and columns inside Ontology in the same way you can with semantic model properties.


Of course, I am not saying that Ontology is unnecessary.

Rather, I believe Microsoft will continue to invest heavily in this area, and I personally have high expectations for it.

However, at least for now, I think the right stage is:

  • Development teams should try it in a test environment
  • Organizations should watch it as a future architecture option

On the other hand, I think it is still a little early to talk about adopting it broadly in production right away.

Semantic Models Will Continue to Be Important for AI

So, does that mean the semantic layer is still something for the future?

I do not think so.

Rather, even right now, building a well-designed semantic model is very effective. I also believe that even after Ontology becomes generally available in the future, the importance of semantic models will not disappear.

Officially, Ontology can be generated from semantic models. In other words, it feels more natural to see Ontology not as something that replaces semantic models, but as something that extends business meaning and relationships on top of semantic models as one of its foundations.

What Semantic Models Can Already Do Today

With the arrival of Data Agent, semantic models are no longer just models for BI.

You can specify a semantic model as a data source for a Data Agent, and through Data Agent customization, you can provide business metadata to AI.

For example:

  • Semantic model

    • Use the “Prep for AI” feature
    • Write the business meaning of tables and columns in properties such as table names, column names, table descriptions, and column descriptions
    • Predefine calculations and business logic with DAX
  • Data Agent

    • Clarify the role of the agent through instructions
    • Add descriptions for data sources so that the agent can choose the right source depending on the question
    • Use example query sets for expected questions
      • Note: this is not available for semantic models

For more details, I recommend starting with the following documentation.

Semantic model best practices for data agent

Best practices for configuring your data agent

Also, a Data Agent does not necessarily need to have only one data source.

When the data volume is large, or when you want to use example query sets, combining a semantic model with a lakehouse or warehouse can be a very realistic design.

For example:

  • Store large volumes of data in a lakehouse or warehouse
  • Organize the metrics and definitions you want AI to use in a semantic model

If you want to add business metadata to each table or column, my personal recommendation at this point is to write it in the semantic model properties.

Data Agent can refer to semantic model properties.

Related article:

Editing Semantic Model Metadata Properties from a Notebook with Semantic Link in Fabric

When Would Ontology Become Necessary?

At this point, you might think, “Then isn’t a semantic model enough?”

In fact, I think semantic models can cover a large part of many use cases.

That said, based on my current understanding, I feel that Ontology becomes especially useful in the following two scenarios.

In other words, if your use case does not fall into these two patterns, a semantic model may be enough for now.

1. When You Want to Query Across Multi-Layered Relationships Like a Graph

The first case is when you want to ask questions that go across multiple layers of relationships.

Semantic models can also express relationships. However, as the relationships become more complex, the thinking tends to become more JOIN-oriented.

Ontology, on the other hand, uses a graph-based approach, so it seems better suited to graph-like operations such as path exploration.

For example, imagine you have the following tables:

  • Customers
  • Orders
  • Products
  • Contracts
  • Support history
  • Responsible organizations
  • Related events

If you want to ask, “What is related to this customer?” across multiple business domains, Ontology seems like a more natural way to express that.

In other words, Ontology becomes meaningful when the relationships themselves are valuable, rather than when you only need simple aggregations or KPI questions.

2. When You Want to Treat Historical Data and Real-Time Data as One Business Entity

The second case is when you want to treat historical data and real-time data not as separate systems, but as the same business object.

Officially, Fabric IQ is described as a way to unify data in OneLake using business language and give consistent meaning to analytics and AI agents.

For example:

  • Recent order events stored in Eventhouse
  • Historical order data accumulated in Lakehouse

If you want to handle these together in the context of a single business entity such as “Order,” the idea of Ontology seems to be a very good fit.

This feels less like a simple BI model, or physical model, and more like a foundation that helps AI understand the meaning structure of the business, in other words, a logical model.

We Do Not Need to Rush Ontology. For Now, This Is a Preparation Phase

As I have written so far, I believe Ontology has great potential.

However, I personally do not think it is something that must be introduced as the highest priority right now.

Ontology can be seen as a mechanism for strengthening the business meaning layer afterward.

Therefore, rather than seeing it as a foundation that must be introduced from the beginning, it feels more natural to think of it as something that organizations can add after their data platform and semantic organization have reached a certain level of maturity.

In fact, even if you want to use Ontology, there will likely be many cases where the organization’s data itself is not yet ready.

For example:

  • Required tables do not exist
  • Key definitions and meanings differ across systems
  • Tables that should be related cannot be connected cleanly through relationships

In such a state, the problem exists before Ontology can even be built.

That is why I believe the most important thing right now is to prepare and organize the organization’s data so that it can take advantage of Ontology in the future.

Microsoft will likely continue to invest heavily in this area, and the concept of Ontology itself will become increasingly important.

In that sense, I think we should see the current phase not as “the time to rush Ontology into production,” but as a preparation period for creating the conditions where Ontology can be used effectively.

In addition, I also feel that building Ontology requires a surprisingly high level of skill.

It is not enough to have only data modeling knowledge.

You need both:

  • An understanding of the business meaning behind the organization’s operations and data
  • The data modeling knowledge required to turn that meaning into a structure

In other words, Ontology cannot be built only by the IT department.

At the same time, it also cannot be fully defined only by the business department.

Collaboration between IT and business will be important, and people who understand both sides to some extent will become increasingly valuable.

Bonus 1: Foundry IQ Already Feels More Practical

As a side note, based on my experience, Foundry IQ felt more practical at this point.

For example, use cases such as the following are relatively easy to imagine even now:

  • Using OneLake as a knowledge source
  • Using SharePoint as a knowledge source

Fabric Ontology still looks like something that may become very interesting in the future.

On the other hand, Foundry IQ already feels easier to connect to concrete use cases.

Of course, these two are not competitors. I believe they will become more connected over time.

Bonus 2: Data Agent Development Works Well with CI/CD and Should Use Git Integration

This is slightly separate from Ontology, but through FabCon, I was reminded again that Data Agent works very well with CI/CD.

Are you using Git integration in Fabric?

As mentioned earlier, when developing a Data Agent, you define items such as instructions, data source descriptions, and example query sets.

Among these, data source descriptions may not change very frequently.

However, I feel that instructions and example query sets are things that will continue to evolve once the agent starts being used.

For example, in actual operation, the following situations are likely to happen:

  • A user asks an unexpected question, and you want to add a query set for that pattern
  • You adjust the instruction prompt, but the accuracy becomes worse
  • You want to roll back to a previous version and check the behavior
  • You want to compare the previous version and the latest version while testing

In other words, a Data Agent is not something you configure once and forget.

It is something that should be continuously improved during operation.

That is why it works very well with Git integration, where you can manage change history, track differences, and roll back when necessary.

If you want to use Data Agent seriously in Fabric, I believe it is important not only to create the agent, but also to grow it with Git integration in mind.

Related articles:

Microsoft Fabric Git Integration × Azure DevOps: How to Release Fabric Items Across Different Tenants

How to Reflect Changes to Another Repository with Azure DevOps Pipeline: A Minimal Memo for Repo A → Repo B

Summary

Finally, here is my current understanding.

  • Expectations for Fabric IQ / Ontology are high
  • However, it is still in preview, so I would be cautious about using it in production at this stage
  • In many cases, the combination of semantic models and Data Agent is already quite effective
  • Ontology will become especially useful in scenarios such as:
    • Queries across multi-layered relationships
    • Use cases where accumulated data and real-time data need to be handled in one business context

I believe this is definitely an area where Microsoft will continue to invest.

Therefore, now is a good time to catch up on Ontology and prepare your organization’s data platform so that you can adopt it quickly when the right timing comes.

Thank you for reading this long article!

I Also Have a YouTube Channel!

https://www.youtube.com/@msfabricreijiotake

Introducing Cossmology: a Map of the Commercial OSS Universe

Chinstrap Community is proud to introduce Cossmology, a comprehensive, worldwide directory of over 1,000 commercial open source software (COSS) companies.

If you’re working on an OSS project around which you’ve built, or plan to build, a commercial offering, tell us about it by using our Submit feature.

We’ve also launched COSS Weekly, a newsletter that delivers all the latest COSS news, funding rounds, acquisitions, and other headlines to your inbox. No sales pitches, no ads, just all of the week’s most relevant news from the COSS universe (check out our COSS Weekly archive).

We’ve mirrored much of the Cossmology dataset on GitHub (repository, searchable index) so be sure to star us.

Feedback welcome!


Build a Secure API with Rails 8 – Part 1

Hi folks👋!

In this post I want to share something I wish I had when I started building APIs with Ruby on Rails: a practical guide that takes security seriously from the beginning.

When I built my first REST API, most tutorials I found were focused on getting something running quickly. They were great for learning the basics, but they usually skipped important topics like API versioning, authentication strategy, authorization, and security.

Even when using AI tools to generate a “secure API”, the result is often still insecure unless you already understand the threats you are trying to protect against. Security is not something you get automatically. You need to know what problems you are solving and why the protections matter.

I ended up reading API design books, OWASP documentation, and real-world breach reports before I finally felt like I understood what I was building, and I’ve put it all into practice. This post is the guide I wish I had back then.

In this series we are going to build a production-ready Rails 8 API with authentication, authorization, rate limiting, secure cookies, security headers, and other important protections. I also want to explain the reasoning behind each decision, not just copy-paste code without context.

Before writing any code, let’s first understand the main attack vectors we need to defend against.

The attack vectors we are defending against

1. XSS (Cross-Site Scripting)

🚨 Threat:
XSS happens when an attacker injects malicious JavaScript into content that later gets rendered in another user’s browser. In API-driven applications, one of the biggest risks is token theft. If JWTs are stored in localStorage, a malicious script can read and steal them immediately.

🛡️ Mitigation:
Avoid storing authentication tokens in localStorage or other browser-accessible storage. Instead, store them in secure HttpOnly cookies so JavaScript cannot access them. Cookies should also use the Secure and SameSite attributes. Any user-generated content rendered in the frontend should be properly escaped or sanitized.
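In Rails, issuing the token that way looks roughly like this. The cookie name, token variable, and expiry below are placeholders, not a prescribed setup:

```ruby
# Set the JWT in a signed, HttpOnly cookie instead of returning it in
# the response body for localStorage. Page JavaScript can never read it.
cookies.signed[:access_token] = {
  value: token,            # your encoded JWT
  httponly: true,          # invisible to document.cookie, so XSS can't steal it
  secure: true,            # only sent over HTTPS
  same_site: :lax,         # withheld on most cross-site requests
  expires: 15.minutes.from_now
}
```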

2. SQL Injection

🚨 Threat:
SQL Injection happens when user input is inserted directly into a SQL query without proper sanitization. An attacker can manipulate the query to bypass authentication, read sensitive data, or modify the database.

🛡️ Mitigation:
Avoid interpolating user input directly into SQL queries. In Rails, prefer Active Record methods like where, find_by, and parameterized queries, which automatically sanitize input. If raw SQL is unavoidable, use bound parameters instead of string interpolation. You should also validate input, use strong parameters, and follow the principle of least privilege for database accounts.
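The difference is easy to see side by side; the query shapes below are standard Active Record:

```ruby
# Vulnerable: user input interpolated straight into the SQL string.
User.where("email = '#{params[:email]}'")    # don't do this

# Safe: Active Record binds the value as a parameter.
User.where(email: params[:email])
User.where("email = ?", params[:email])
User.find_by(email: params[:email])
```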

3. CSRF (Cross-Site Request Forgery)

🚨 Threat:
CSRF happens when a malicious website tricks a logged-in user’s browser into sending authenticated requests to your application using automatically attached cookies.

This is especially important in Rails APIs using session cookies or JWTs stored in HttpOnly cookies. Even though JavaScript cannot read those cookies, the browser still sends them automatically with requests.

An attacker could potentially trigger actions like changing account settings, creating resources, or deleting data without the user realizing it.

🛡️ Mitigation:
Enable CSRF protection for any cookie-based authentication flow. In Rails, use protect_from_forgery and require valid CSRF tokens for state-changing requests like POST, PUT, PATCH, and DELETE.

Authentication cookies should also use:

  • HttpOnly

  • Secure

  • SameSite=Lax or SameSite=Strict

You should also validate Origin and Referer headers and keep CORS restricted to trusted frontend domains.

If the browser automatically sends authentication, CSRF protection still matters, even if the API itself is technically stateless.
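A minimal sketch of wiring this up in an API-only Rails app that authenticates with cookies (in `ActionController::API` the forgery-protection and cookie modules are not included by default, so they must be added explicitly):

```ruby
class ApplicationController < ActionController::API
  include ActionController::Cookies
  include ActionController::RequestForgeryProtection

  # Reject state-changing requests that lack a valid CSRF token
  protect_from_forgery with: :exception
end
```

The frontend then needs to read the CSRF token (for example from a dedicated endpoint or a non-HttpOnly cookie) and send it back in a header on every POST, PUT, PATCH, and DELETE.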

4. Brute Force

🚨 Threat:
Brute force attacks happen when an attacker repeatedly tries large numbers of username and password combinations against your login endpoint.

This commonly targets login forms, password reset endpoints, and authentication APIs. Successful attacks can lead to account compromise, credential stuffing, and unnecessary server load.

🛡️ Mitigation:
Use rate limiting on authentication-related endpoints. In Rails, tools like Rack::Attack can throttle repeated requests by IP address, email, or both.

You should also:

  • temporarily lock accounts after repeated failures

  • require strong passwords

  • detect suspicious login activity

  • avoid revealing whether an account exists

  • consider CAPTCHA or step-up verification after suspicious behavior
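As a sketch, a Rack::Attack initializer that throttles logins by IP and by submitted email might look like this (the path and limits are illustrative, not recommendations):

```ruby
# config/initializers/rack_attack.rb
class Rack::Attack
  # At most 5 login attempts per IP in any 20-second window
  throttle("logins/ip", limit: 5, period: 20.seconds) do |req|
    req.ip if req.path == "/login" && req.post?
  end

  # At most 5 attempts per email per minute, regardless of source IP
  throttle("logins/email", limit: 5, period: 1.minute) do |req|
    if req.path == "/login" && req.post?
      req.params["email"].to_s.downcase.presence
    end
  end
end
```

One caveat: `req.params` only sees query and form parameters, so if your login endpoint takes a JSON body you need to parse it yourself inside the throttle block.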

5. User Enumeration

🚨 Threat:
User enumeration happens when an application reveals whether an account exists through different error messages.

For example:

  • “Email not found”

  • “Incorrect password”

An attacker can use these differences to discover valid accounts and later target them with brute force attacks, phishing, or credential stuffing.

🛡️ Mitigation:
Return consistent responses during login, password reset, and account recovery flows.

Instead of exposing whether the email exists, use generic responses such as:

  • “Invalid credentials”

  • “If an account exists, instructions have been sent”

You should also rate limit these endpoints and monitor repeated probing attempts.
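The core idea can be shown in plain Ruby: both failure paths collapse into one indistinguishable response (the names and messages here are illustrative):

```ruby
# Return the exact same error whether the email is unknown or the
# password is wrong, so the response leaks nothing about account existence.
GENERIC_ERROR = { error: "Invalid credentials" }.freeze

def login_response(account, password_ok)
  return GENERIC_ERROR unless account && password_ok
  { email: account[:email], authenticated: true }
end
```

An attacker probing the endpoint sees identical output for "no such user" and "wrong password", so the responses carry no signal.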

6. IDOR (Insecure Direct Object Reference)

🚨 Threat:
IDOR happens when users can access resources they do not own by changing identifiers in URLs or request parameters.

For example:


User.find(params[:id])

If ownership checks are missing, changing /users/42 to /users/43 could expose another user’s data.

🛡️ Mitigation:
Always scope records through the authenticated user or an authorization policy.

Instead of:


Post.find(params[:id])

Prefer:


current_user.posts.find(params[:id])

Authorization libraries like Pundit or CanCanCan also help enforce access rules consistently across the application. I also avoid exposing raw database IDs directly to the frontend. Instead, I use Sqids to generate less predictable public IDs, which helps reduce simple enumeration attacks.

7. Mass Assignment

🚨 Threat:
Mass assignment happens when the application accepts user input and blindly assigns it to model attributes.

An attacker could submit unexpected fields such as:


{
  "admin": true
}

If those fields are not filtered properly, the attacker may gain elevated privileges or modify protected data.

🛡️ Mitigation:
Use strong parameters in every controller.

In Rails, always whitelist allowed attributes using:


params.require(:user).permit(:email, :password)

Never pass raw params directly into create or update.

Sensitive fields like roles, permissions, ownership fields, or account status flags should never be user-assignable.

8. Excessive Data Exposure

🚨 Threat:
Excessive data exposure happens when an API returns more information than the client actually needs.

This often happens when entire Active Record objects are rendered directly into JSON responses.

Sensitive data such as password digests, internal IDs, permissions, API keys, or private metadata may accidentally leak through the API.

🛡️ Mitigation:
Only return the fields the client actually needs.

Instead of blindly rendering full objects:


render json: @user

Use serializers or custom JSON responses that explicitly define safe attributes.

Sensitive fields should never appear in API responses.

You should also regularly review serialized responses to make sure no internal data is leaking unintentionally.
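A hand-rolled serializer is enough to make the whitelist explicit (no gem assumed; the class and field names are illustrative):

```ruby
# Only the fields listed here can ever appear in the response, so a newly
# added sensitive column cannot leak by default.
class UserSerializer
  SAFE_FIELDS = %i[id email name].freeze

  def self.render(user)
    SAFE_FIELDS.to_h { |field| [field, user.public_send(field)] }
  end
end
```

In a controller this becomes `render json: UserSerializer.render(@user)`, and the default deny-by-omission means reviews only need to check the whitelist.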

9. MITM (Man-in-the-Middle)

🚨 Threat:
A Man-in-the-Middle attack happens when an attacker intercepts traffic between the client and server.

Without HTTPS, credentials, tokens, cookies, and other sensitive data can travel in plain text and be stolen or modified.

Attackers on the same network, malicious proxies, or compromised routers can hijack sessions or impersonate users.

🛡️ Mitigation:
Always enforce HTTPS.

In Rails, enable:


config.force_ssl = true

This redirects insecure requests and ensures cookies are only sent over encrypted connections.

Authentication cookies should also use the Secure and HttpOnly flags.

You should additionally enable HSTS headers and avoid loading insecure mixed-content resources.

10. Token Theft

🚨 Threat:
Token theft happens when an attacker gains access to a valid authentication token and uses it to impersonate a user.

Stolen JWTs can come from XSS attacks, insecure storage, leaked logs, browser extensions, compromised devices, or intercepted traffic.

If tokens remain valid for a long time, the attacker may keep access even after the user notices something is wrong.

🛡️ Mitigation:
Reduce token exposure and keep token lifetimes short.

Prefer storing tokens in secure HttpOnly cookies instead of localStorage.

Use:

  • short-lived access tokens

  • refresh token rotation

  • token revocation mechanisms

You should also avoid exposing tokens in logs or URLs and protect the application against XSS vulnerabilities.
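To make "short-lived" concrete, here is a toy signed token with an expiry check in plain Ruby using stdlib OpenSSL (the secret and TTL are placeholders for illustration; in a real app use a vetted JWT gem and a properly managed key):

```ruby
require "openssl"
require "base64"
require "json"

SECRET = "demo-secret" # illustration only; never hard-code real secrets

# Encode the payload with an expiry timestamp and sign it with HMAC-SHA256.
def issue_token(payload, ttl: 900)
  body = payload.merge("exp" => Time.now.to_i + ttl)
  data = Base64.urlsafe_encode64(JSON.generate(body))
  sig  = OpenSSL::HMAC.hexdigest("SHA256", SECRET, data)
  "#{data}.#{sig}"
end

# Return the payload only if the signature matches and the token is unexpired.
def verify_token(token)
  data, sig = token.split(".")
  expected = OpenSSL::HMAC.hexdigest("SHA256", SECRET, data.to_s)
  return nil unless sig && OpenSSL.secure_compare(expected, sig)
  body = JSON.parse(Base64.urlsafe_decode64(data))
  return nil if body["exp"] < Time.now.to_i # stolen tokens die quickly
  body
end
```

With a 15-minute lifetime, a stolen access token is only useful briefly; refresh token rotation and revocation cover the longer-lived credential.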

11. Verbose Error Messages

🚨 Threat:
Verbose error messages expose internal application details to attackers.

Stack traces, database errors, framework versions, SQL queries, and file paths can all help attackers understand how the system works and make exploitation easier.

🛡️ Mitigation:
Production applications should return generic and safe error responses.

Instead of exposing internal exceptions, return messages such as:

  • Internal Server Error

  • Invalid request

Detailed errors should only be logged internally for debugging.

In Rails, make sure debug pages and detailed exceptions are disabled in production.
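In a Rails API controller this usually takes the form of rescue_from handlers that log the real exception and return a generic body (a sketch, assuming an API-only app):

```ruby
class ApplicationController < ActionController::API
  # Generic handler first: Rails searches handlers in reverse declaration
  # order, so more specific rescues below take precedence.
  rescue_from StandardError do |exception|
    Rails.logger.error("#{exception.class}: #{exception.message}")
    render json: { error: "Internal Server Error" }, status: :internal_server_error
  end

  rescue_from ActiveRecord::RecordNotFound do
    render json: { error: "Not found" }, status: :not_found
  end
end
```

Also confirm that `config.consider_all_requests_local = false` in production, so Rails never renders its debug pages to end users.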

Final Thoughts

These are some of the most important security risks to think about when building APIs, and we will revisit them throughout this series as we implement each feature step by step.

In Part 2 we will start building the Rails 8 API from scratch and set up the project foundation correctly from the beginning, including authentication, secure configuration, and API structure.

Follow along if you want to get notified when the next part is published.

The week your AI coding tier got smaller

In 48 hours this week, two of the biggest AI coding platforms confirmed the same thing: your unlimited subscription was never sustainable for how you actually use it. The provider will be the one who decides when to cut you off.

Anthropic silently removed Claude Code from Pro in a “2% A/B test” (later reversed). Their Head of Growth justified it by saying “usage has changed a lot and our current plans weren’t built for this.” GitHub paused new Copilot Pro signups and dropped Opus from Pro entirely.

One dev on HN said sending 3-4 messages to Opus 4.7 blew through their $20 plan limits and consumed $10 of extra usage.

Simon Willison framed the trust break: “Should I be taking a bet on Claude Code if I know that they might 5x the minimum price of the product?”

The structural takeaway for any team shipping AI features: the invoice is the governance boundary, not the plan page. The provider’s unit economics are now public. Every user is a small loss when they exceed the pricing assumption, and no vendor has found the pricing floor yet.

Teams that cannot meter their own spend per-customer, per-agent, per-task are now one pricing memo away from being unprofitable overnight.

The concrete fix:

  1. track your tokens (not the invoice’s)
  2. use per-customer attribution (so you know whose usage is killing you)
  3. implement hard budget caps at the agent level. Alerts don’t stop a runaway loop.

This is exactly what LLM Budget Guard is being built for.

Here is how a wrapper around the SDK produces per-customer token attribution without waiting for invoice day:

import { wrapOpenAI } from 'llmeter';
import OpenAI from 'openai';

const openai = wrapOpenAI(new OpenAI(), {
  projectId: 'prod-cluster',
  tenantId: 'cust_883'
});

// Cost is now tracked per customer automatically
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Generate report' }]
});

Track your costs early. Check out LLMeter to get started with attribution.

Developer Ecosystem Survey 2026 – Take Part in One of the Largest Developer Studies

Since 2017, we’ve been checking in with developers around the world to better understand how the industry is evolving and where software development is headed next.

This year marks the tenth edition of the Developer Ecosystem Survey, and we’d love for you to take part.

When we launched the first survey, Kotlin was just emerging, and AI coding tools were still years away. Today, they are part of everyday development.

Every year, tens of thousands of developers share their experiences, helping create one of the most comprehensive pictures of the tools, technologies, and challenges shaping modern development. The survey insights are widely used across the developer community – from researchers and industry analysts to teams building developer tools.

Whether you’re building large-scale systems, mobile apps, games, or experimenting with side projects, your perspective matters.

Set aside about 30 minutes, grab a drink, get comfortable, and tell us about your experience as a developer.

TAKE THE SURVEY

Have your say and get a chance to win one of these prizes:

  • MacBook Pro 16″
  • USD 1,000 Amazon Gift Card or alternative
  • USD 150 JetBrains Merchandise Store voucher
  • One-year JetBrains All Products Pack subscription
  • A guaranteed 30% discount for an individual JetBrains license

The more developers who participate, the clearer the picture we can build of today’s software development ecosystem. When you’re done, you’ll receive a personal referral link to share with friends and colleagues. The participants who bring in the most responses via their referral link will receive an additional prize.

As always, we’ll publish the results in detailed infographics and reports, and we’ll release the anonymized raw data for anyone who wants to explore the findings further.

Thank you for helping us capture a snapshot of where development is headed in 2026 – and for being part of the global developer community that has supported this initiative for the past decade.

IntelliJ IDEA 2025.3.5 is Out!

We’ve just released IntelliJ IDEA 2025.3.5. This version includes performance improvements for Spring projects – specifically for users who haven’t yet updated to v2026.1:

  • Searches for declared Spring beans are no longer triggered during typing or completion, ensuring code completion works smoothly in Spring-based projects. [IDEA-378966]

You can update to this version from inside the IDE, using the Toolbox App, or using snaps if you are an Ubuntu user. You can also download it from our website.

For a comprehensive overview of the fixes, see the release notes. If you spot any issues, let us know via the issue tracker.

Happy developing!