SSL/TLS Certificates Explained: HTTPS Security for Every Website

TL;DR
SSL/TLS certificates are the backbone of encrypted web communication, authenticating server identity and
protecting data in transit. With over 95% of web traffic now encrypted via HTTPS, understanding certificate
types, the TLS 1.3 handshake, certificate chains, and common pitfalls is essential for every developer and
sysadmin. This guide covers the full lifecycle — from issuance to renewal — with practical tooling.

📑 Table of Contents

  • What Is SSL/TLS?
  • The TLS 1.3 Handshake
  • Certificate Types
  • Certificate Chain of Trust
  • OCSP & Revocation
  • HSTS — HTTP Strict Transport Security
  • Certbot & Automation
  • Best Practices
  • Common Mistakes
  • Tools
  • References

What Is SSL/TLS?

Transport Layer Security (TLS) — the successor to the deprecated SSL protocol — provides encryption,
authentication, and integrity for data transmitted between clients and servers. As of 2024, TLS 1.3
accounts for over 60% of all encrypted connections, with TLS 1.2 covering most of the remainder.
SSL 2.0 and 3.0 are considered insecure and must never be used.

📖 Definition — A digital certificate is a digitally signed document that binds a public key to an identity (domain, organization). It is issued by a Certificate Authority (CA) after validating ownership.

The TLS 1.3 Handshake

TLS 1.3 (defined in RFC 8446) reduces the handshake from two round-trips to just one (1-RTT),
and supports 0-RTT resumption for returning clients, dramatically reducing latency.

ClientHello — Client sends supported cipher suites, key shares (ECDHE), and a random nonce.

ServerHello — Server selects cipher suite, sends its key share, and the handshake is encrypted from this point.

Server Parameters & Certificate — Server sends encrypted extensions, its certificate, and a CertificateVerify signature.

Finished — Both sides derive session keys and exchange Finished messages. Application data flows immediately.

💡 TLS 1.3 removed insecure algorithms: RSA key exchange, CBC ciphers, SHA-1, RC4, DES, and 3DES are all gone. Only AEAD ciphers (AES-GCM, ChaCha20-Poly1305) remain.

Certificate Types

| Type | Validation | Use Case | Issuance Time |
|---|---|---|---|
| DV (Domain Validated) | Domain ownership only | Blogs, personal sites, APIs | Minutes |
| OV (Organization Validated) | Domain + org identity | Business websites | 1–3 days |
| EV (Extended Validation) | Rigorous legal/physical checks | Banks, e-commerce | 1–2 weeks |
| Wildcard | Covers *.example.com | Multi-subdomain projects | Varies |

⚠️ Wildcard certificates cover only one level of subdomain. *.example.com covers api.example.com but NOT v2.api.example.com.
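
That one-level rule can be sketched in shell. A hypothetical helper, not part of any TLS tooling, just the matching logic:

```shell
#!/usr/bin/env bash
# Hypothetical helper: does a one-level wildcard cert cover this hostname?
covers() {
  local host=$1 base=${2#\*.}       # strip the leading "*."
  local label=${host%."$base"}      # what remains before ".base"
  # Covered only if exactly one extra label exists (non-empty, no dots)
  [[ $label != "$host" && -n $label && $label != *.* ]]
}
covers api.example.com '*.example.com'    && echo "api: covered"
covers v2.api.example.com '*.example.com' || echo "v2.api: NOT covered"
```

Note that the apex (example.com itself) is also not covered by *.example.com, which is why wildcard certificates usually list the apex as an additional SAN.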

Certificate Chain of Trust

A certificate chain links your server’s leaf certificate to a trusted root CA via one or more
intermediate CAs. Browsers and OS trust stores contain root CAs; the server must send the intermediates.

Leaf Certificate  (your domain)
    ↓  signed by
Intermediate CA   (e.g., R3 — Let's Encrypt)
    ↓  signed by
Root CA           (e.g., ISRG Root X1 — in trust stores)

🚫 Never serve only the leaf certificate without intermediates. This causes “unable to verify the first certificate” errors in clients that don’t have the intermediate cached.
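
To see why, you can reproduce the verifier's behavior locally. A sketch that assumes the openssl CLI is installed and collapses the intermediate tier for brevity (a demo root signs the leaf directly):

```shell
# Sketch: why verification needs the full chain (assumes the openssl CLI;
# a demo root signs the leaf directly, collapsing the intermediate tier)
tmp=$(mktemp -d)
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout "$tmp/root.key" -out "$tmp/root.pem" -days 7 -subj "/CN=Demo Root" 2>/dev/null
openssl req -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout "$tmp/leaf.key" -out "$tmp/leaf.csr" -subj "/CN=example.com" 2>/dev/null
openssl x509 -req -in "$tmp/leaf.csr" -CA "$tmp/root.pem" -CAkey "$tmp/root.key" \
  -CAcreateserial -out "$tmp/leaf.pem" -days 7 2>/dev/null
openssl verify -CAfile "$tmp/root.pem" "$tmp/leaf.pem"   # succeeds: chain complete
openssl verify "$tmp/leaf.pem" 2>&1 | head -1            # fails: issuer unknown
```

A client missing the intermediate is in the second situation: the leaf's issuer cannot be found, so the handshake is rejected.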

OCSP & Revocation

When a private key is compromised, the certificate must be revoked. Two mechanisms exist:

  • CRL (Certificate Revocation List) — A downloadable list of revoked serial numbers. Can be large and slow.

  • OCSP (Online Certificate Status Protocol) — Real-time check against the CA. Preferred method.

Pro Tip: Enable OCSP Stapling on your server. The server fetches the OCSP response periodically and sends it during the TLS handshake, eliminating the client’s need to contact the CA — improving privacy and performance.

# Nginx — enable OCSP stapling
ssl_stapling on;
ssl_stapling_verify on;
resolver 1.1.1.1 8.8.8.8 valid=300s;
resolver_timeout 5s;

HSTS — HTTP Strict Transport Security

HSTS tells browsers to always use HTTPS for your domain, preventing protocol downgrade attacks and cookie hijacking.

# Nginx header
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;

🎯 Submit your domain to the HSTS Preload List to have browsers enforce HTTPS before the first visit. Requires max-age ≥ 1 year, includeSubDomains, and preload.

Certbot & Automation

Certbot is the official ACME client from the EFF for obtaining and renewing free Let’s Encrypt certificates.

# Install and obtain a certificate (Nginx)
sudo apt install certbot python3-certbot-nginx
sudo certbot --nginx -d example.com -d www.example.com

# Auto-renewal (cron or systemd timer)
sudo certbot renew --dry-run

💡 Let’s Encrypt certificates are valid for 90 days. Certbot’s systemd timer runs twice daily and renews certificates once they are within 30 days of expiry. Always test renewal with --dry-run first.
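
The renewal window is simple date arithmetic. A sketch with illustrative dates, assuming GNU date:

```shell
# Sketch: days until certificate expiry, with illustrative dates (assumes GNU date)
not_after="2025-06-01"   # e.g. taken from: openssl x509 -noout -enddate
now="2025-03-03"         # issuance date of a 90-day certificate
days=$(( ( $(date -u -d "$not_after" +%s) - $(date -u -d "$now" +%s) ) / 86400 ))
echo "$days days remaining"                 # prints "90 days remaining"
if [ "$days" -le 30 ]; then echo "renew now"; fi
```

The 30-day threshold mirrors Certbot's default behavior: renewing well before expiry leaves room for retries if the ACME request fails.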

Best Practices

Use TLS 1.2 as the minimum version and prefer TLS 1.3. Disable TLS 1.0 and 1.1 entirely.

Enable OCSP Stapling and configure a valid resolver.

Deploy HSTS with a long max-age and consider preloading.

Use ECDSA P-256 keys for better performance than RSA 2048.

Automate renewal — never let certificates expire manually.

Redirect all HTTP traffic to HTTPS with a 301 redirect.

Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| Missing intermediate certificate | Broken chain on some clients | Bundle intermediates in the cert file |
| Expired certificate | Browser security warnings, lost trust | Automate renewal with Certbot |
| Mixed content (HTTP resources on HTTPS page) | Browser blocks insecure resources | Load all resources over HTTPS |
| Allowing TLS 1.0/1.1 | Vulnerable to BEAST and POODLE-style CBC attacks | Set ssl_protocols TLSv1.2 TLSv1.3; |
| Weak cipher suites | Susceptible to brute-force or downgrade | Use the Mozilla SSL Configuration Generator |

Tools

Check your SSL/TLS configuration with our built-in checker:

  • 🔧 SSL Certificate Checker — Verify certificate validity, chain, expiry, and protocol support.

References

  • 📄 RFC 8446 — The Transport Layer Security (TLS) Protocol Version 1.3

  • 📄 Let’s Encrypt Documentation

  • 📄 Mozilla Server Side TLS Guidelines

  • 📄 Mozilla SSL Configuration Generator

  • 📄 Certbot — EFF

  • 📄 HSTS Preload List Submission

🎯 Key Takeaway: Modern TLS is non-negotiable. Use TLS 1.3 with AEAD ciphers, automate certificate management with Certbot,
serve the full certificate chain, enable OCSP Stapling, and enforce HTTPS via HSTS. A misconfigured certificate
erodes user trust faster than almost any other infrastructure issue.

Originally published on StarNomina ToolBox. Try our free online tools — no signup required.

BIMI: Display Your Brand Logo in Email Inboxes

TL;DR
BIMI (Brand Indicators for Message Identification) is the final layer in the email authentication stack, allowing organizations to display their brand logo directly in recipients’ inboxes next to authenticated messages. Built on top of DMARC enforcement, BIMI transforms email authentication from an invisible infrastructure concern into a visible brand asset. This guide covers the DNS record format, SVG Tiny PS requirements, VMC certificates, provider support, cost analysis, and the full setup procedure — including when BIMI is (and isn’t) worth the investment.

📑 Table of Contents

  • How BIMI Works
  • DNS Record Format
  • SVG Tiny PS Requirements
  • VMC Certificates
  • Provider Support Matrix
  • DMARC Prerequisite
  • Cost Analysis
  • Setup Steps
  • Best Practices
  • Common Mistakes
  • Tools
  • Sources & References

1. How BIMI Works

BIMI leverages the existing email authentication stack (SPF, DKIM, DMARC) and adds a visual trust indicator. When a message passes DMARC with an enforcement policy, the receiving mail client looks up the sender’s BIMI DNS record to retrieve a logo URL and (optionally) a VMC certificate that validates brand ownership.

Message arrives
The receiver performs standard SPF, DKIM, and DMARC evaluation. The message must pass DMARC with p=quarantine or p=reject.

BIMI DNS lookup
The receiver queries default._bimi.example.com for a TXT record containing the logo URL and optional VMC URL.

Logo retrieval & VMC validation
The receiver fetches the SVG logo from the l= URL. If an a= (authority) URL is present, it fetches and validates the VMC certificate against the domain and logo.

Logo display
If all checks pass, the mail client displays the brand logo as the sender’s avatar. Without BIMI, a generic initial or silhouette is shown.

📖 Definition — BIMI (Brand Indicators for Message Identification) is an email specification that enables domain owners to display a verified brand logo in supporting email clients, contingent on DMARC enforcement and (for some providers) a Verified Mark Certificate (VMC).

💡 BIMI is not just cosmetic. Research from the BIMI Working Group shows that brand logos increase email open rates by 10–39% and significantly improve brand recall. It’s a deliverability and marketing asset as much as a security one.

2. DNS Record Format

A BIMI record is a DNS TXT record published at default._bimi.yourdomain.com:

default._bimi.example.com  TXT  "v=BIMI1; l=https://example.com/brand/logo.svg; a=https://example.com/brand/vmc.pem"

| Tag | Required | Meaning | Value |
|---|---|---|---|
| v | Yes | Version | BIMI1 |
| l | Yes | Logo URL (HTTPS) | URL to SVG Tiny PS file |
| a | No* | Authority (VMC certificate URL) | URL to PEM-encoded VMC |

*Gmail and Apple Mail require a VMC (a= tag) to display the logo. Without it, only providers like Fastmail and Yahoo display BIMI logos.

⚠️ The l= URL must use HTTPS with a valid TLS certificate. HTTP URLs are rejected. The SVG file must be served with Content-Type: image/svg+xml and appropriate CORS headers.

Selector Variants

The default selector covers all mail. You can publish additional selectors for different use cases (e.g., marketing._bimi.example.com), though receiver support for non-default selectors is limited.

# Default BIMI record — applies to all mail
default._bimi.example.com  TXT  "v=BIMI1; l=https://example.com/logo.svg; a=https://example.com/vmc.pem"

# To explicitly disable BIMI for a domain:
default._bimi.example.com  TXT  "v=BIMI1; l=;"
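
Once a record is published (fetch it with dig TXT default._bimi.example.com), the tags are easy to sanity-check by parsing the string. A minimal sketch on a literal record:

```shell
# Sketch: extract BIMI tags from a TXT record string (illustrative record,
# not fetched from DNS)
record="v=BIMI1; l=https://example.com/logo.svg; a=https://example.com/vmc.pem"
logo=$(echo "$record" | tr ';' '\n' | sed -n 's/^ *l=//p')
vmc=$(echo "$record"  | tr ';' '\n' | sed -n 's/^ *a=//p')
echo "logo: $logo"
echo "vmc:  $vmc"
case $logo in
  https://*) echo "logo URL is HTTPS: OK";;
  *)         echo "REJECTED: l= must be HTTPS";;
esac
```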

3. SVG Tiny PS Requirements

BIMI does not accept standard SVG files. The logo must conform to SVG Tiny PS (Portable/Secure), a restricted profile designed for security and consistent rendering across mail clients.

📖 Definition — SVG Tiny PS is a constrained subset of the SVG Tiny 1.2 specification, created specifically for BIMI. It removes scripting, external references, and other features that could pose security risks in email clients.

Key Requirements

| Requirement | Detail |
|---|---|
| Profile declaration | Must include baseProfile="tiny-ps" and version="1.2" |
| Dimensions | Square aspect ratio required; viewBox must be square |
| Title element | Must contain a <title> element |
| No scripting | No <script> elements, event handlers, or JavaScript |
| No external references | No xlink:href to external resources, no <image> elements |
| No animations | No <animate>, <animateTransform>, or SMIL elements |
| No raster images | No embedded PNG/JPEG via data URIs or external links |
| File size | Should be under 32 KB (recommended limit) |
| Background | Should have a solid background — transparent logos render poorly on varied email client backgrounds |

Minimal Valid SVG Tiny PS Template

<svg xmlns="http://www.w3.org/2000/svg" version="1.2" baseProfile="tiny-ps" viewBox="0 0 64 64">
  <title>Example Corp Logo</title>
  <rect width="64" height="64" fill="#1a73e8"/>
  <!-- Simple letterform "E" built from solid shapes -->
  <rect x="20" y="16" width="6" height="32" fill="#ffffff"/>
  <rect x="20" y="16" width="24" height="6" fill="#ffffff"/>
  <rect x="20" y="29" width="20" height="6" fill="#ffffff"/>
  <rect x="20" y="42" width="24" height="6" fill="#ffffff"/>
</svg>

🚫 Common SVG errors that break BIMI: Missing baseProfile="tiny-ps", non-square viewBox, embedded <image> tags, xlink:href references, inline styles using url() for external resources, gradients referencing filters. Always validate with the BIMI Group’s SVG checker.

Pro Tip: Export your logo from a vector editor (Illustrator, Figma), then manually clean the SVG: remove metadata, comments, embedded fonts, and Illustrator-specific namespaces. Add the baseProfile="tiny-ps" and version="1.2" attributes. Validate with the BIMI validator before publishing.
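
Some of those cleanup checks can be scripted as a pre-flight step before the official validator. A rough grep-based sketch on an illustrative file, nowhere near a full Tiny PS validation:

```shell
# Sketch: quick pre-flight checks on an SVG file (not a substitute for the
# BIMI Group validator); the sample markup here is illustrative
cat > logo.svg <<'EOF'
<svg xmlns="http://www.w3.org/2000/svg" version="1.2" baseProfile="tiny-ps" viewBox="0 0 64 64">
  <title>Example Corp Logo</title>
  <rect width="64" height="64" fill="#1a73e8"/>
</svg>
EOF
grep -q 'baseProfile="tiny-ps"' logo.svg && echo "profile: OK"
grep -q '<title>' logo.svg               && echo "title: OK"
! grep -qE '<script|xlink:href|<image' logo.svg && echo "no forbidden elements: OK"
```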

4. VMC Certificates

A Verified Mark Certificate (VMC) is an X.509 certificate that cryptographically binds your brand logo to your domain. It is issued by a Certificate Authority after verifying your trademark registration and domain ownership.

VMC Issuers

| Certificate Authority | Annual Cost (approx.) | Trademark Requirement |
|---|---|---|
| DigiCert | $1,299 – $1,499/year | Registered trademark (USPTO, EUIPO, WIPO Madrid, etc.) |
| Entrust | $1,299 – $1,499/year | Registered trademark |

💡 As of 2024, DigiCert and Entrust are the only two Certificate Authorities authorized to issue VMC certificates. The BIMI Working Group requires CAs to be members and follow strict validation procedures.

VMC Validation Requirements

  • Registered trademark — Your logo must be a registered trademark in an accepted jurisdiction (USPTO, EUIPO, CIPO, IP Australia, WIPO Madrid Protocol, and others).

  • Domain ownership — You must prove ownership/control of the domain specified in the certificate.

  • Logo match — The SVG file referenced in your BIMI record must match the trademarked logo in the VMC.

  • DMARC enforcement — Your domain must have p=quarantine or p=reject.

⚠️ The VMC issuance process typically takes 3–6 weeks due to trademark verification. Plan ahead — you cannot rush this step.

5. Provider Support Matrix

| Mail Provider | BIMI Support | VMC Required? | Notes |
|---|---|---|---|
| Gmail | Yes | Yes | Full support since July 2021; requires VMC |
| Apple Mail | Yes | Yes | Supported since iOS 16 / macOS Ventura |
| Yahoo/AOL | Yes | No | Displays BIMI logos without VMC |
| Fastmail | Yes | No | Early BIMI adopter; no VMC needed |
| Microsoft Outlook | Partial | N/A | Uses proprietary “Brand Indicators” via Microsoft 365 admin; not standard BIMI |
| Zoho Mail | Yes | No | Supports BIMI without VMC |
| ProtonMail | No | N/A | No BIMI support as of 2025 |
| Thunderbird | No | N/A | No BIMI support |

1.8B+ mailboxes support BIMI (Gmail + Apple Mail + Yahoo).

6. DMARC Prerequisite

BIMI has a hard dependency on DMARC enforcement. Your domain must have a DMARC record with p=quarantine or p=reject for BIMI logos to be displayed.

| DMARC Policy | BIMI Effect |
|---|---|
| p=none | BIMI ignored — logo is not displayed |
| p=quarantine | BIMI active — logo displayed for passing messages |
| p=reject | BIMI active — logo displayed for passing messages |

🎯 If you haven’t deployed DMARC yet, start there. Follow the phased rollout (p=none → p=quarantine → p=reject) before investing in BIMI and VMC. BIMI is the reward for achieving full email authentication maturity.
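
That prerequisite is easy to gate on in a deployment script. A sketch that inspects a DMARC record string (in practice, pull yours with dig TXT _dmarc.yourdomain.com):

```shell
# Sketch: is this DMARC record BIMI-eligible? (illustrative record string)
dmarc="v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"
policy=$(echo "$dmarc" | tr ';' '\n' | sed -n 's/^ *p=//p' | head -1)
case $policy in
  quarantine|reject) echo "BIMI eligible (p=$policy)";;
  *)                 echo "not eligible: enforce DMARC first (p=$policy)";;
esac
```

Anchoring the pattern at the start of each tag avoids false matches on sp= (the subdomain policy tag), which also contains "p=".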

7. Cost Analysis

BIMI itself is free (it’s a DNS record). The costs come from the VMC certificate and preparation:

| Item | Cost | Frequency |
|---|---|---|
| BIMI DNS record | Free | One-time setup |
| SVG Tiny PS logo creation | $0 – $500 | One-time (designer time or self-service) |
| Trademark registration (if not already registered) | $250 – $2,000+ | Initial filing + maintenance |
| VMC certificate (DigiCert or Entrust) | $1,299 – $1,499 | Annual renewal |
| DMARC enforcement (prerequisite) | $0 – varies | Ongoing monitoring & management |

💡 Without VMC: For Yahoo, Fastmail, and Zoho, you can deploy BIMI for free (just a DNS record + SVG). For Gmail and Apple Mail (the vast majority of consumer mailboxes), you need a VMC. The total first-year cost with VMC is typically $1,500 – $3,500.

When BIMI Is Worth It

High email volume
If you send millions of emails monthly, a 10–39% increase in open rates easily justifies the VMC cost.

Strong brand recognition
Recognizable logos (retail, finance, SaaS) benefit most. A logo people don’t recognize adds no value.

Already have a trademark
If your logo is already registered, VMC cost is the only expense — the ROI is very favorable.

Phishing target
Financial institutions, e-commerce platforms, and government agencies that are frequently impersonated get anti-phishing benefits from visual brand verification.

⚠️ BIMI is NOT worth it if: You haven’t achieved p=reject DMARC yet, you send very low volume, your brand is new/unknown, or your logo isn’t trademarked and you don’t plan to trademark it.

8. Setup Steps

Achieve DMARC Enforcement
Ensure your domain has p=quarantine or p=reject with 100% alignment. BIMI requires DMARC to work.

Prepare SVG Tiny PS Logo
Convert your logo to SVG Tiny PS format. Square aspect ratio, no scripts, no external references, no raster images. Set baseProfile="tiny-ps" and version="1.2".

Validate SVG
Use the BIMI Group’s SVG validator or the BIMI Inspector tool to check compliance before publishing.

Obtain VMC (Optional/Required)
If targeting Gmail/Apple Mail, purchase a VMC from DigiCert or Entrust. Provide your trademark registration number and domain verification.

Host Assets
Upload the SVG and VMC PEM file to your web server over HTTPS. Ensure correct Content-Type headers and public accessibility.

Publish DNS Record
Add a TXT record at default._bimi.yourdomain.com with the v=BIMI1; l= and a= tags pointing to your hosted files.

Test & Verify
Send a test email to a Gmail account and check if the logo appears. Use BIMI Inspector to verify DNS, SVG, and VMC configuration.

# Complete DNS configuration example

# 1. DMARC record (prerequisite)
_dmarc.example.com  TXT  "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"

# 2. BIMI record
default._bimi.example.com  TXT  "v=BIMI1; l=https://example.com/brand/logo.svg; a=https://example.com/brand/vmc.pem"

9. Best Practices

Validate SVG rigorously
Use the official BIMI Group validator. Even minor deviations from SVG Tiny PS will cause silent failures — no logo, no error.

Use a solid background
Transparent SVG backgrounds render differently across email clients. Use a solid brand-color background for consistent appearance.

Keep SVG under 32 KB
While not a hard limit, larger files may be rejected or slow to render. Optimize paths and remove unnecessary metadata.

Monitor DMARC continuously
BIMI vanishes if your DMARC policy drops below enforcement. A single misconfiguration can remove your logo from billions of inboxes.

Plan for VMC renewal
VMC certificates expire annually. Set a calendar reminder 30 days before expiration and renew early to avoid logo disappearance.

Cache and CDN considerations
Receivers cache your SVG logo. After updating, it may take days for the new version to propagate. Use a different filename or cache-busting query parameter.

10. Common Mistakes

🚫 Using standard SVG instead of SVG Tiny PS. Regular SVG files exported from Illustrator, Figma, or Inkscape include features not allowed in Tiny PS (gradients with filters, embedded images, metadata). The logo will silently fail to display.

🚫 Deploying BIMI with p=none DMARC. BIMI requires DMARC enforcement (quarantine or reject). With p=none, receivers ignore the BIMI record entirely.

⚠️ Non-square logo. BIMI requires a square aspect ratio. Rectangular logos will be rejected or cropped unpredictably by mail clients.

⚠️ Hosting SVG over HTTP. The l= tag must point to an HTTPS URL with a valid TLS certificate. HTTP URLs are rejected by all BIMI-supporting receivers.

⚠️ Expecting instant display. After publishing a BIMI record, it can take 24–72 hours for receiver caches to populate. Gmail specifically crawls BIMI records on its own schedule.

⚠️ Forgetting the VMC for Gmail. About 30% of all email goes to Gmail. Without a VMC, your BIMI setup covers only Yahoo, Fastmail, and smaller providers — a fraction of your audience.

11. Tools

| Tool | Purpose |
|---|---|
| BIMI Record Checker | Look up BIMI DNS records, validate SVG URL, and check VMC presence |

12. Sources & References

  • 📄 BIMI Group — Implementation Guide

  • 📄 BIMI Group — SVG Tiny PS Specification

  • 📄 Google Workspace — Set up BIMI

  • 📄 Google — BIMI requirements and troubleshooting

  • 📄 DigiCert — Verified Mark Certificates (VMC)

  • 📄 Entrust — Verified Mark Certificates (VMC)

  • 📄 RFC 7489 — DMARC (BIMI dependency)

  • 📄 RFC 6376 — DKIM (authentication layer)

  • 📄 RFC 7208 — SPF (authentication layer)

🎯 Key Takeaway: BIMI is the visible payoff of a mature email authentication stack. It requires DMARC enforcement (p=quarantine or p=reject), a logo in SVG Tiny PS format, and — for Gmail and Apple Mail — a Verified Mark Certificate (~$1,500/year). Deploy BIMI after you’ve achieved full DMARC enforcement, not before. For high-volume senders with recognized brands, the ROI in open rates and brand protection is substantial. For everyone else, get your DMARC house in order first — BIMI is the cherry on top.

Originally published on StarNomina ToolBox. Try our free online tools — no signup required.

DNS Propagation: How Long Does It Really Take? (With Technical Explanation)

TL;DR
DNS propagation — the time it takes for DNS changes to reach every resolver worldwide — is one of the most
misunderstood concepts in web operations. While changes can appear instant for some users, others may wait
up to 72 hours due to aggressive caching. Understanding the resolution flow, TTL mechanics, caching
layers, and pre-change strategies lets you execute DNS migrations with near-zero downtime.

📑 Table of Contents

  • What Is DNS Propagation?
  • The DNS Resolution Flow
  • TTL Mechanics & Caching
  • Caching Layers
  • Pre-Change Strategy
  • Anycast DNS
  • Real-World Propagation Timing
  • Best Practices
  • Common Mistakes
  • Tools
  • References

What Is DNS Propagation?

When you update a DNS record at your registrar or DNS provider, the change is immediately live on your
authoritative nameservers. However, recursive resolvers around the world have cached
the old record and will continue serving it until the cached entry’s TTL expires. The gradual process of
every resolver picking up the new record is called DNS propagation.

📖 Definition — DNS propagation is not a push mechanism — there is no broadcast. Each recursive resolver independently expires its cache based on the TTL from the last response it received from the authoritative server.

The DNS Resolution Flow

Understanding the full resolution path explains why propagation takes time and where caching occurs.

Stub Resolver — Your device’s OS-level resolver sends a query to the configured recursive resolver (e.g., your ISP, or 1.1.1.1 / 8.8.8.8).

Recursive Resolver — Checks its cache. If found and TTL hasn’t expired, returns the cached answer immediately. If not, begins iterative resolution.

Root Servers — The recursive resolver queries one of the 13 root server clusters, which responds with the TLD nameserver (e.g., .com NS).

TLD Nameserver — Returns the authoritative NS records for the specific domain (e.g., ns1.provider.com).

Authoritative Nameserver — Returns the actual record (A, CNAME, MX, etc.) with its TTL. The recursive resolver caches this response.

Response to Client — The recursive resolver returns the answer to your device, which may also cache it locally.

💡 Each step in the chain can cache results. Root and TLD NS records are cached for long periods (often 48 hours), but your domain’s records are cached according to their own TTL.

TTL Mechanics & Caching

TTL (Time To Live), defined in RFC 1035, is a 32-bit integer representing seconds. When a resolver
caches a record, it decrements the TTL over time. At zero, the entry is evicted and must be re-fetched.

; Example: A record with 1-hour TTL
example.com.    3600    IN    A    93.184.216.34

; After 2000 seconds, a resolver's cached copy has:
;   Remaining TTL = 3600 - 2000 = 1600 seconds

| TTL (seconds) | Human Readable | Propagation Window |
|---|---|---|
| 60 | 1 minute | ~1–5 minutes globally |
| 300 | 5 minutes | ~5–15 minutes |
| 3600 | 1 hour | ~1–2 hours |
| 86400 | 24 hours | ~24–48 hours |
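
The cache math from the zone-file comment above, as a runnable check with illustrative values:

```shell
# Sketch: remaining TTL in a resolver cache (illustrative values)
original_ttl=3600    # TTL from the authoritative response
elapsed=2000         # seconds since the resolver cached the record
remaining=$(( original_ttl - elapsed ))
echo "remaining TTL: ${remaining}s"        # prints "remaining TTL: 1600s"
if [ "$remaining" -le 0 ]; then echo "expired: re-query authoritative"; fi
```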

⚠️ Some resolvers do not honor low TTLs. Certain ISP resolvers enforce a minimum TTL floor (commonly 300 seconds). RFC 2308 allows negative caching of NXDOMAIN responses for up to the SOA minimum TTL.

Caching Layers

DNS responses are cached at multiple levels, each with different eviction behavior:

| Layer | Location | Cache Duration | Flushable? |
|---|---|---|---|
| Browser | Chrome, Firefox, etc. | Up to 60 seconds (Chrome) | Yes — chrome://net-internals/#dns |
| OS | Windows/macOS/Linux stub resolver | Varies (often honors TTL) | Yes — ipconfig /flushdns |
| Router/LAN | Home/office router, local DNS | Varies widely | Reboot router |
| ISP Resolver | ISP’s recursive nameserver | Honors TTL (usually) | No — must wait for expiry |
| Public Resolver | 1.1.1.1, 8.8.8.8, 9.9.9.9 | Strictly honors TTL | Cloudflare: purge cache tool |

Pro Tip: During a migration, test from multiple resolvers. Use dig @1.1.1.1, dig @8.8.8.8, and dig @9.9.9.9 to see whether major public resolvers have picked up the change. Your ISP’s resolver may lag behind.

Pre-Change Strategy

The single most important technique for fast, smooth DNS changes is TTL pre-lowering.

48 hours before — Lower the TTL on the record you plan to change to 60–300 seconds. Wait for the old high TTL to expire from all caches.

Make the change — Update the DNS record to its new value. Because the TTL is now low, caches expire quickly.

Verify propagation — Use a global DNS checker to confirm the new value is seen from multiple locations worldwide.

After confirmation — Raise the TTL back to its normal production value (e.g., 3600 or 86400).

# Step 1: Lower TTL (48h before migration)
example.com.    60    IN    A    93.184.216.34    ; was 86400

# Step 2: Change record (migration day)
example.com.    60    IN    A    104.21.45.67     ; new server

# Step 3: After propagation confirmed, restore TTL
example.com.    3600  IN    A    104.21.45.67

🚫 Never skip the TTL pre-lowering step. If your record has a 24-hour TTL and you change it without lowering first, some users will be stuck on the old IP for up to 24 hours — longer still where resolvers stretch TTLs beyond their advertised value.
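
The "verify propagation" step can be automated as a polling loop. In this sketch the lookup is stubbed so it runs anywhere; in real use, replace it with something like dig +short @1.1.1.1 example.com A:

```shell
# Sketch: poll until a resolver returns the new IP (lookup is stubbed here;
# swap in `dig +short @1.1.1.1 example.com A` for real checks)
expected="104.21.45.67"
attempt=0
lookup() {  # stub: pretends the resolver cache expires after 3 polls
  if [ "$attempt" -lt 3 ]; then echo "93.184.216.34"; else echo "$expected"; fi
}
while [ "$(lookup)" != "$expected" ]; do
  attempt=$((attempt + 1))
  # sleep 60   # in real use, wait between polls
done
echo "propagated after $attempt polls"     # prints "propagated after 3 polls"
```

Running the same loop against several resolvers (1.1.1.1, 8.8.8.8, 9.9.9.9) gives a rough picture of global propagation without leaving the terminal.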

Anycast DNS

Anycast is a routing technique where the same IP address is announced from multiple geographic
locations. DNS providers like Cloudflare, Route 53, and Google Cloud DNS use anycast to route queries to
the nearest server, reducing latency and improving redundancy.

💡 Anycast means two users in different countries querying the same resolver IP (e.g., 1.1.1.1) may hit different physical servers with different cache states. This is why propagation appears inconsistent across regions.

Real-World Propagation Timing

| Change Type | Typical Duration | Worst Case |
|---|---|---|
| A/AAAA record (low TTL pre-set) | 1–10 minutes | 30 minutes |
| A/AAAA record (high TTL, no prep) | 4–24 hours | 72 hours |
| NS record change (registrar) | 12–24 hours | 48 hours |
| New domain (fresh registration) | Minutes–2 hours | 24 hours |
| MX record change | Follows TTL | TTL + resolver floor |

🎯 For zero-downtime migrations, keep the old server running until propagation completes. Both old and new servers should serve valid responses during the transition window.

Best Practices

Always pre-lower TTL 48 hours before any DNS change.

Keep old infrastructure running in parallel during the propagation window.

Use public resolvers (1.1.1.1, 8.8.8.8) for testing — they strictly honor TTLs.

Monitor propagation from multiple geographic locations, not just your local machine.

Use anycast DNS providers for lower query latency and faster cache refresh across regions.

Document your rollback plan before changing DNS — know the old values and how to revert.

Common Mistakes

| Mistake | Impact | Fix |
|---|---|---|
| Not lowering TTL before migration | Hours of stale DNS for some users | Pre-lower to 60s, wait 48h, then change |
| Shutting down old server immediately | Downtime for users still resolving old IP | Keep old server live for 2× the old TTL |
| Testing only from local machine | Local cache gives false positive | Flush local cache + test from multiple resolvers |
| Forgetting to restore TTL after change | Excessive queries to authoritative server, slower resolution | Raise TTL back to 3600+ after propagation |
| Ignoring negative caching (RFC 2308) | Deleted records linger as NXDOMAIN in caches | Pre-create records before pointing traffic |

Tools

Monitor and verify your DNS propagation in real time:

  • 🔧 DNS Lookup — Query any record type against specific resolvers.

  • 🔧 Global DNS Checker — Verify propagation status from 20+ worldwide locations simultaneously.

References

  • 📄 RFC 1035 — Domain Names: Implementation and Specification

  • 📄 RFC 2308 — Negative Caching of DNS Queries (DNS NCACHE)

  • 📄 Cloudflare — DNS Record Types

  • 📄 Cloudflare — What Is DNS Propagation?

  • 📄 Cloudflare — Purge 1.1.1.1 Cache

🎯 Key Takeaway: DNS propagation is not magic — it’s cache expiration. The single most effective technique is
TTL pre-lowering: drop the TTL to 60 seconds 48 hours before your change, make the update,
verify globally, then restore the TTL. Keep old infrastructure running during the transition window
and always test from multiple geographic vantage points, not just your local machine.

Originally published on StarNomina ToolBox. Try our free online tools — no signup required.

I Built an AI Code Reviewer That Uses Any LLM to Review Claude Code Output — Zero Dependencies, 7 Commands, Infinite Engines

TL;DR: I built cc-review — a pure bash Claude Code skill that spins up any external LLM (Gemini, Ollama, DeepSeek, OpenAI) to independently review Claude’s own code output. No npm. No pip. Just 7 slash commands, a YAML config, and an uncomfortable truth about trusting AI to review itself. Repo here.

Here’s the uncomfortable truth nobody talks about: Claude is reviewing Claude’s code.

You vibe-code a multi-phase feature. You run /review. Claude reads its own output and says “looks good!” You ship. Two days later you’re debugging a race condition that any second pair of eyes would have caught in 30 seconds.

You didn’t get a code review. You got a mirror.

I hit this exact wall while building a multi-phase AI Second Brain project — an agentic system with memory modules, knowledge indexing, scheduled tasks, and a self-learning loop. Each phase produced hundreds of lines of generated code. I was using Claude Code for everything: architecting, implementing, reviewing. The confirmation bias was baked in.

I needed a reviewer with zero loyalty to the original author.

So I built cc-review: an open-source Claude Code skill that outsources your code review to any external LLM engine. Gemini reviews Claude’s work. Ollama stays local and private. DeepSeek brings a different training distribution. The engine is pluggable. The bash is pure. The cost, with Gemini’s free tier, is zero.

Here’s exactly how it works and how you can add it to your own Claude Code setup in under 10 minutes.

What We’re Building

cc-review is a Claude Code skill — a bash-powered plugin that extends Claude Code with new slash commands. When you trigger a review, it:

  1. Grabs your recent git diff (staged + unstaged changes)
  2. Routes the diff to an external LLM engine of your choice
  3. Scores the code on four dimensions: Completeness, Correctness, Quality, Security
  4. Returns structured feedback with line-level comments
  5. Optionally runs an adversarial review mode that actively tries to break your assumptions
Your Code Changes (git diff)
        │
        ▼
┌───────────────────┐
│   cc-review skill │  ← Pure bash, reads engines.yaml
└────────┬──────────┘
         │
    ┌────▼─────────────────────────────────┐
    │         engines.yaml router          │
    └──┬──────────┬───────────┬────────────┘
       │          │           │
   ┌───▼───┐  ┌───▼───┐  ┌───▼───────┐
   │Gemini │  │Ollama │  │ DeepSeek  │  ← any LLM, zero code changes
   └───────┘  └───────┘  └───────────┘
       │          │           │
       └──────────┼───────────┘
                  │
         ┌────────▼────────┐
         │ Scored Report   │  Completeness / Correctness
         │ (4 dimensions)  │  Quality / Security
         └─────────────────┘

Seven commands ship out of the box:

| Command | What It Does |
|---|---|
| /review | Standard review using your default engine |
| /review-adversarial | Skeptic mode — actively challenges your code |
| /review-result | Show full output of last review |
| /review-status | Check running/recent jobs |
| /review-setup | Verify engine auth and readiness |
| /review-cancel | Kill a running background job |
| /review-rescue | Delegate investigation/fix to external engine |

Prerequisites

  • Claude Code installed and running (>= 0.2.x)
  • git available in your shell
  • At least one of: a Gemini API key (free tier), Ollama running locally, or an OpenAI/DeepSeek key
  • Basic comfort with YAML config files

Step-by-Step

1. Install the Skill

Clone cc-review into your Claude Code skills directory:

git clone https://github.com/mudavathsrinivas/cc-review ~/.claude/skills/cc-review

Claude Code auto-discovers skills in ~/.claude/skills/. No import step. No config file edit. The skill system reads the directory on startup.

Verify it loaded:

# Inside Claude Code
/review-setup

You’ll see output like:

cc-review v1.0.0
─────────────────────────────────
Engine Check:
  gemini    ✓  (GEMINI_API_KEY set)
  ollama    ✓  (http://localhost:11434 reachable)
  deepseek  ✗  (DEEPSEEK_API_KEY not set)
  openai    ✗  (OPENAI_API_KEY not set)

Default engine: gemini
Ready to review.

2. Configure Your Engines

This is the part I’m most proud of. Every engine lives in a single YAML file — engines.yaml. Adding a new LLM requires zero code changes. You just describe it:

# ~/.claude/skills/cc-review/engines.yaml

default: gemini

engines:
  gemini:
    provider: google
    model: gemini-2.0-flash
    api_key_env: GEMINI_API_KEY
    endpoint: https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
    max_tokens: 8192
    temperature: 0.3
    enabled: true

  ollama:
    provider: ollama
    model: llama3.2
    endpoint: http://localhost:11434/api/generate
    max_tokens: 4096
    temperature: 0.2
    enabled: true
    private: true   # flag: never send to external APIs

  deepseek:
    provider: deepseek
    model: deepseek-coder
    api_key_env: DEEPSEEK_API_KEY
    endpoint: https://api.deepseek.com/v1/chat/completions
    max_tokens: 8192
    temperature: 0.2
    enabled: false  # flip to true when key is set

  openai:
    provider: openai
    model: gpt-4o
    api_key_env: OPENAI_API_KEY
    endpoint: https://api.openai.com/v1/chat/completions
    max_tokens: 8192
    temperature: 0.2
    enabled: false

The router reads this at runtime. Set default to swap your primary reviewer. Set enabled: false to disable an engine without deleting its config. The private: true flag on Ollama is a guardrail — the skill will refuse to send that diff to any external endpoint even if you fat-finger the engine flag.
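The guard logic is simple enough to sketch. Here is an illustrative Python version of the router's engine resolution and private-engine check — the real skill is pure bash, so names and structure here are assumptions, not the actual implementation:

```python
def resolve_engine(config, requested=None):
    """Pick an engine from an engines.yaml-style dict and enforce `private`."""
    name = requested or config["default"]
    engine = config["engines"][name]
    if not engine.get("enabled", False):
        raise ValueError(f"engine '{name}' is disabled")
    # guardrail: a private engine must never route to an external endpoint
    if engine.get("private") and not engine["endpoint"].startswith("http://localhost"):
        raise ValueError(f"engine '{name}' is private but its endpoint is external")
    return name, engine["endpoint"]

# toy config mirroring the YAML above (endpoints illustrative)
config = {
    "default": "gemini",
    "engines": {
        "gemini": {"enabled": True, "endpoint": "https://example.invalid/v1"},
        "ollama": {"enabled": True, "private": True,
                   "endpoint": "http://localhost:11434/api/generate"},
    },
}
print(resolve_engine(config))  # → ('gemini', 'https://example.invalid/v1')
```

Swapping `default` or passing an explicit engine flag both funnel through the same check, which is why a fat-fingered flag still can't leak a private diff.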

Want to add a brand new LLM? Add a YAML block. Done.
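For example, a hypothetical Mistral block would slot in alongside the others — the api.mistral.ai chat-completions endpoint is real, but the provider value and defaults here are assumptions:

```yaml
  mistral:
    provider: mistral
    model: mistral-large-latest
    api_key_env: MISTRAL_API_KEY
    endpoint: https://api.mistral.ai/v1/chat/completions
    max_tokens: 8192
    temperature: 0.2
    enabled: false  # flip to true once the key is exported
```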

3. Run Your First Review

Make some changes in your project, then:

/review

The skill captures git diff HEAD (staged + unstaged), structures a review prompt, sends it to your default engine, and streams back a scored report. A real output looks like this:

cc-review | engine: gemini-2.0-flash | 2026-04-11 09:14 CST
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

SCORES
  Completeness  8/10  — Core logic is present; error branches for
                        empty API response are missing.
  Correctness   7/10  — Line 84: off-by-one in pagination cursor.
                        Will silently drop the last record.
  Quality       9/10  — Clean separation of concerns. Good.
  Security      6/10  — API key interpolated directly into log
                        string at line 112. Rotate and fix.

CRITICAL (fix before merge)
  [security] src/client.ts:112
  → `console.log(`Auth: ${apiKey}`)` logs the raw key.
    Replace with a masked version: apiKey.slice(0,4) + '****'

  [correctness] src/paginator.ts:84
  → Cursor offset is `page * limit` but should be
    `(page - 1) * limit` for 1-indexed pagination.
    Current code skips page 1 entirely.

SUGGESTIONS
  [completeness] src/client.ts:67
  → No handling for HTTP 429 (rate limit). Add exponential
    backoff or surface the error to caller.

  [quality] src/types.ts:23
  → ApiResponse<T> type is wide. Consider discriminated union
    for success/error states.

SUMMARY
  Solid implementation with two ship-blockers. The security
  issue is trivial to fix. The pagination bug would have caused
  silent data loss in production. Review cost: $0.00 (free tier).

That pagination bug? 100% something Claude wrote and Claude would have rubber-stamped. Gemini caught it because it has no attachment to the original decision.

4. Use Adversarial Mode for Critical Phases

Standard review finds bugs. Adversarial mode finds assumptions you didn’t know you were making.

/review-adversarial

The prompt instructs the external engine to play the role of a skeptical senior engineer who actively looks for: race conditions, wrong abstractions, over-engineering, security footguns, and implicit dependencies that will break in production.

Real example output from my Second Brain project’s memory indexer:

ADVERSARIAL REVIEW | engine: gemini-2.0-flash
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

CHALLENGED ASSUMPTIONS

  1. "Files are processed sequentially"
     → Your glob pattern returns files in filesystem order.
       On macOS this is usually alphabetical. On Linux ext4
       it's creation order. On network mounts, undefined.
       Your tests will pass locally and break in CI.

  2. "MEMORY.md is always writable"
     → No lock file, no atomic write. Two agents running
       concurrently will corrupt this file. You mentioned
       scheduled tasks — this WILL happen.

  3. "The embedding model is stable"
     → You hardcode 'text-embedding-3-small' but never pin
       the version. OpenAI has silently updated embeddings
       before. Your similarity scores will drift over time
       and you won't know why.

VERDICT
  Ship with fixes for #1 and #2. #3 is acceptable risk
  for a solo project but document the assumption explicitly.

None of those were in the standard review output. Adversarial mode thinks differently.

5. Keep Private Code Private with Ollama

If you’re working on proprietary code and can’t send diffs to Google or OpenAI, flip to Ollama:

/review --engine ollama

Or set default: ollama in engines.yaml for all reviews. Everything stays on your machine. The private: true config flag means the router will hard-fail rather than accidentally route to an external API.

# Verify your Ollama setup first
/review-setup --engine ollama

# Output:
# ollama: ✓ (llama3.2 loaded, 8B params, ~6GB RAM used)
# private mode: ON — external routing disabled for this engine

Quality is lower than frontier models, but for security-sensitive codebases or pure sanity checks, local Ollama at zero cost is a real option.

The Result

After integrating cc-review into my Second Brain project workflow, here’s what changed:

  • Gemini free tier handles ~1,000 reviews/day at $0.00. For a solo developer shipping phases one at a time, this is effectively unlimited.
  • 5 real bugs caught in 3 weeks that I confirmed would have reached production — 2 correctness issues, 2 security issues, 1 missing error handler.
  • Review latency: 8-14 seconds per phase diff using Gemini Flash. Fast enough to run after every significant change without breaking flow.
  • Adversarial mode changed how I think about code I generate. I now proactively consider race conditions and assumption brittleness because I’ve seen the engine surface them repeatedly.

The workflow became:

Implement phase with Claude → /review → fix blockers → /review-adversarial → fix assumptions → commit

Key Takeaway

An LLM cannot objectively review its own output. Not because the model is bad — because the training distribution, the context window, and the confirmation bias are all pointing the same direction. Independent review means a genuinely different model, with different training data, reading your code cold.

cc-review is a 20-minute setup that gives you that independence, at zero cost, with full control over which engine reviews which code and whether anything ever leaves your machine.

The irony is: the better your AI coding assistant gets, the more you need this. The faster Claude ships code, the faster bugs accumulate without a second opinion.

Star the repo if this is useful: github.com/mudavathsrinivas/cc-review

Pull requests welcome — especially new engine configs for engines.yaml. If you’ve got a working block for Mistral, Cohere, or any local model, open a PR and I’ll merge it.

Follow me here on Dev.to — I’m documenting the full AI Second Brain build in public, including the infrastructure, the failures, and the moments where the AI confidently wrote something completely wrong.

I Rewrote My Portfolio From Scratch — Here’s What Actually Changed (And Why)

My old portfolio wasn’t bad. It had all the things that feel polished when you first build them: card grids, gradient overlays, staggered animations, rounded-3xl corners everywhere. You look at it and think: yeah, that looks like a modern site.

Then six months pass and you keep opening it and something feels off. Nothing is obviously wrong. But every section is quietly competing for attention. There’s no hierarchy. It’s just noise.

So I started over on the design language, rewrote most of the components, added a couple of features I’d been putting off for too long, and somewhere in the middle of all this, switched my entire dev setup from Windows to macOS. This post is the full breakdown.

The Design Shift: From “Modern SaaS” to Editorial

The old design lived in a very specific aesthetic bucket I’d call modern SaaS: shadows on cards, borders everywhere, lots of colour, hover-lift animations. It’s a style that works great for product landing pages. For a personal portfolio it ends up feeling generic.

The new direction is closer to editorial design. Think tech publication meets printed magazine. The changes sound small individually but add up fast.

No more rounded corners. I removed rounded-3xl from basically everything. Flat, sharp edges give the layout a much more intentional, structured feel.

Left accent bars instead of cards. Instead of wrapping content in bordered card boxes, list rows now have a thin vertical bar on the left that scales in from the center on hover:

<div className="absolute left-0 inset-y-0 w-0.5 bg-gray-900 dark:bg-white origin-center scale-y-0 group-hover:scale-y-100 transition-transform duration-200 rounded-sm" />

Mono eyebrow labels. Every section heading now has a small uppercase label above it. The kind of detail you don’t consciously notice, but that makes everything feel considered:

<span className="font-mono text-[9px] tracking-[0.45em] uppercase text-gray-500">
  {eyebrow}
</span>

Section number watermarks. Big faded numbers (01, 02, 03) sit behind each section. More editorial energy, less web-app energy.

Invert-fill buttons. The hover state on CTAs now runs a fill layer up from the bottom of the button rather than just swapping the background colour:

<button className="group relative inline-flex items-center gap-2 px-7 py-3.5 border border-gray-400 overflow-hidden hover:text-white transition-colors duration-300">
  <span className="absolute inset-0 bg-gray-900 translate-y-full group-hover:translate-y-0 transition-transform duration-300 ease-[cubic-bezier(0.22,1,0.36,1)]" />
  <span className="relative z-10">View Work</span>
</button>

Border dividers instead of card wrappers. Lists are flat rows with border-b separators. The content breathes. You can actually read it.

The whole palette simplified down to gray-900 and white in dark mode, with emerald only for “currently available” indicators.

The Navbar

Two things had been bugging me about the old navbar: it added a drop shadow on scroll, and the hamburger icon was just two static SVGs swapping in and out.

The scroll behaviour now switches to a border-b instead:

// Before
navRef.current!.classList.add("shadow", "backdrop-blur-xl", "bg-white/70");

// After
navRef.current!.classList.add(
  "border-b",
  "border-gray-200",
  "dark:border-neutral-700",
  "backdrop-blur-xl",
  "bg-white/80",
  "dark:bg-darkPrimary/90"
);

The hamburger became three motion.span lines that animate into an X:

<motion.span
  animate={open ? { rotate: 45, y: 7 } : { rotate: 0, y: 0 }}
  transition={{ duration: 0.2 }}
  className="block w-5 h-px bg-gray-900 dark:bg-white origin-center"
/>
<motion.span
  animate={open ? { opacity: 0, scaleX: 0 } : { opacity: 1, scaleX: 1 }}
  className="block w-5 h-px bg-gray-900 dark:bg-white"
/>
<motion.span
  animate={open ? { rotate: -45, y: -7 } : { rotate: 0, y: 0 }}
  className="block w-5 h-px bg-gray-900 dark:bg-white origin-center"
/>

It’s a detail that takes twenty minutes to implement and immediately makes the site feel more alive.

On desktop, nav items use layoutId="nav-underline" for a shared-element animated underline that slides between links. On mobile, the menu now has large numbered links and a “MENU” watermark sitting in the background.

The Hero Section (Built From Zero)

There wasn’t really a proper hero component before, just the first section with some text in it. I built HeroSection.tsx from scratch.

The dot grid background is a single CSS radial-gradient at 12% opacity:

<div
  style={{
    backgroundImage: "radial-gradient(circle, #6b7280 1px, transparent 1px)",
    backgroundSize: "28px 28px",
    opacity: 0.12,
  }}
/>

The “JS” watermark is my initials in massive gradient-clipped text, sitting to the right of the content area. It’s not readable, it’s just a shape:

<div
  className="absolute -right-4 top-1/2 -translate-y-1/2 font-black select-none pointer-events-none bg-gradient-to-b from-gray-200 to-gray-50 dark:from-[#232628] dark:to-darkPrimary bg-clip-text text-transparent"
  style={{ fontSize: "clamp(8rem, 24vw, 22rem)" }}
>
  JS
</div>

The profile image is grayscale at rest and transitions to full colour on hover. I genuinely love this one. It’s the kind of detail that makes a visitor do a double-take the first time they see it:

<Image
  className="grayscale hover:grayscale-0 transition-all duration-500"
  src="/profile.jpg"
  alt="Jatin Sharma"
  width={300}
  height={300}
/>

Corner cross-tick marks at the four corners of the content area give that blueprint feel without being heavy-handed about it.

Blog Cards: Stripping It All Back

This was probably the most dramatic visual change on the site.

The old blog card was tall, image-dominant, and wrapped in a rounded bordered box. It looked fine. But it was heavy. Reading the blog index felt like scrolling through a gallery.

// Before: big card with image and rounded corners
<motion.article className="group bg-white dark:bg-darkSecondary rounded-3xl overflow-hidden border-2 border-gray-100">
  <div className="grid md:grid-cols-2 gap-6 p-6">
    {/* image + content */}
  </div>
</motion.article>

// After: flat row with accent bar
<motion.article className="group relative border-b border-gray-300 dark:border-neutral-700 last:border-0">
  <div className="absolute left-0 inset-y-0 w-0.5 bg-gray-900 dark:bg-white origin-center scale-y-0 group-hover:scale-y-100 transition-transform duration-200" />
  <Link className="flex items-center gap-4 py-6 pl-4 pr-2">
    {/* index + title + arrow */}
  </Link>
</motion.article>

No images. No author avatar. No “Read more” button. Just the content. The result is a blog index you can actually scan. You can read ten titles in the time it used to take to read three.

Skills Section: Marquee and Denser Grid

The skill cards went from tall centered icon boxes to compact horizontal rows in a grid-cols-2 sm:grid-cols-3 lg:grid-cols-4 grid.

The new addition I’m happiest about is the marquee ticker, a continuous horizontal scroll of all skills sitting between the section header and filter buttons:

<div className="flex animate-marquee gap-10 w-max">
  {[...skills, ...skills].map((skill, i) => {
    const Icon = skill.Icon;
    return (
      <div key={i} className="inline-flex items-center gap-2 text-gray-600 dark:text-gray-400 select-none">
        <Icon className="w-3.5 h-3.5 flex-shrink-0" />
        <span className="text-[10px] font-mono uppercase tracking-[0.25em] whitespace-nowrap">
          {skill.name}
        </span>
      </div>
    );
  })}
</div>

The trick is duplicating the array so the scroll loops seamlessly. No library needed, just a CSS @keyframes translate.
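The keyframes themselves are tiny — a sketch, assuming the animate-marquee utility maps to an animation registered in the Tailwind config:

```css
@keyframes marquee {
  from { transform: translateX(0); }
  to   { transform: translateX(-50%); } /* half = exactly one copy of the list */
}

.animate-marquee {
  animation: marquee 30s linear infinite;
}
```

Because the array is duplicated, translating by -50% lands exactly on the start of the second copy, so the jump back to 0 is invisible.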

The filter also now shows a count next to each category name, and switching categories triggers a full stagger re-entry on the grid rather than trying to animate individual items in and out.

Table of Contents: Panel to Drawer

The old TOC was a fixed left panel on desktop. It worked, but it was always there, always occupying space, always creating layout tension with the article content.

The new version is a FAB button that opens a left-side drawer:

{/* FAB */}
<motion.button
  onClick={() => setOpen((o) => !o)}
  className="fixed bottom-6 left-6 z-40 flex items-center gap-2 px-3 h-9 bg-gray-900 dark:bg-white text-white dark:text-gray-900 font-mono text-[10px] tracking-[0.35em] uppercase"
>
  <BsListUl className="w-3.5 h-3.5" />
  <span className="hidden sm:inline">Contents</span>
</motion.button>

{/* Drawer */}
<motion.aside
  initial={{ x: "-100%" }}
  animate={{ x: 0 }}
  exit={{ x: "-100%" }}
  transition={{ type: "spring", stiffness: 340, damping: 32 }}
  className="fixed top-0 left-0 bottom-0 w-full sm:w-80 bg-white dark:bg-darkPrimary border-r border-gray-200 dark:border-neutral-700 flex flex-col"
>

Removing the fixed panel also let me drop four dependencies I didn’t need anymore: useScrollPercentage, useWindowSize, lockScroll, and removeScrollLock. The drawer just uses a backdrop click to close.

The Books Page

This is the biggest entirely new feature.

I’ve been tracking my reading on Hardcover and wanted to surface that data on the portfolio. Hardcover has a GraphQL API so I built a small client:

async function hardcoverQuery<T>(
  query: string,
  variables?: Record<string, unknown>,
): Promise<T> {
  const res = await fetch("https://api.hardcover.app/v1/graphql", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      authorization: `Bearer ${process.env.HARDCOVER_API_KEY}`,
    },
    body: JSON.stringify({ query, variables }),
  });
  const json = await res.json();
  if (!res.ok || json.errors) {
    // surface HTTP/GraphQL failures instead of silently returning undefined
    throw new Error(`Hardcover query failed (${res.status})`);
  }
  return json.data as T;
}

The API route caches for 24 hours with stale-while-revalidate so it doesn’t hammer the endpoint:

res.setHeader(
  "Cache-Control",
  "public, s-maxage=86400, stale-while-revalidate=43200"
);

The page has reading stats, a year-goal progress bar that animates in on scroll, a tabbed shelf for the three reading statuses, and a debounced search across title and author. It’s the kind of page that only makes sense on a personal portfolio.

siteConfig.ts: One Source of Truth

My name, email, job title, section copy, and social links were scattered across maybe a dozen different files. When I needed to update something, even just my job title, I had to grep for it and hope I caught every instance.

Now there’s a single content/siteConfig.ts:

const siteConfig = {
  person: {
    name: "Jatin Sharma",
    email: "work.j471n@gmail.com",
    location: "Based in India",
  },
  home: {
    hero: {
      rolePrefix: "Tech Lead at",
      companyName: "KonnectNXT",
      primaryCta: { label: "Download Resume", url: "https://bit.ly/j471nCV" },
    },
  },
} as const;

socialMedia.ts and user.ts both derive from siteConfig now. One change propagates everywhere. It’s the kind of refactor you keep putting off until you finally do it and immediately wonder why you waited.
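The derivation itself is nothing fancy — a sketch of what a socialMedia.ts-style module can do once everything hangs off siteConfig (field names taken from the snippet above; the derived values are hypothetical):

```typescript
const siteConfig = {
  person: { name: "Jatin Sharma", email: "work.j471n@gmail.com" },
} as const;

// derived once — change siteConfig and every consumer updates
const mailto = `mailto:${siteConfig.person.email}`;
const ogTitle = `${siteConfig.person.name} | Portfolio`;

console.log(mailto);  // → mailto:work.j471n@gmail.com
```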

Syntax Highlighting: Light and Dark

Previously everything used one-dark-pro regardless of colour scheme. Now code blocks adapt properly:

// Before
[rehypePrettyCode, { theme: "one-dark-pro" }]

// After
[rehypePrettyCode, {
  theme: {
    dark: "one-dark-pro",
    light: "github-light",
  },
}]

For MDX local content I went with andromeeda (dark) and catppuccin-latte (light), a slightly warmer combination.

The CodeTitle component also got redesigned. The old bordered box is now a top accent line and a clean mono label:

// Before
<div className="bg-white rounded-tl-md rounded-tr-md p-3 border border-black">

// After
<div className="!mt-4 mb-[14px]">
  <div className="h-0.5 w-full bg-gray-900 dark:bg-white" />
  <div className="bg-white dark:bg-darkSecondary border border-b-0 border-gray-200 dark:border-neutral-700 px-4 py-2 flex items-center gap-2 font-mono overflow-x-auto">
    <Icon className="w-3.5 h-3.5 text-gray-400" />
    <span className="text-[10px] tracking-[0.35em] uppercase text-gray-600 dark:text-gray-400">
      {title || lang}
    </span>
  </div>
</div>

The Uses Page: Full Windows to macOS Migration

I switched from Windows to macOS this year and my /uses page was so out of date it was almost embarrassing.

Gone: Windows 11, Edge, Sublime Text, ShareX, Ditto, 7-Zip, Flameshot, Notepad++, Google Keep, Microsoft Todo.

In: macOS, Homebrew, Raycast (the thing I miss most when I’m on any other machine), Rectangle, iTerm2, Warp, Oh My Zsh, Arc as my primary browser, CleanShot X, Obsidian.

The whole category structure got reworked too: System and OS, Terminal and CLI, Development, Design and Creativity, Productivity, Browsers, Communication.

The Smaller Stuff

A reusable PageHeader component: the watermark + eyebrow + title + description pattern was copy-pasted into every page. Extracted it once, imported it everywhere.

fallback: "blocking" on the blog: changed from fallback: false so newly published posts go live immediately without requiring a full rebuild.
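A minimal sketch of what that looks like in getStaticPaths — the helper and post list are hypothetical; the fallback value is the point:

```typescript
type Post = { slug: string };

function buildStaticPaths(posts: Post[]) {
  return {
    paths: posts.map((p) => ({ params: { slug: p.slug } })),
    // "blocking": an unknown slug is server-rendered on first request,
    // then cached — so a new post goes live with no rebuild
    fallback: "blocking" as const,
  };
}

console.log(buildStaticPaths([{ slug: "hello-world" }]).paths.length);  // → 1
```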

BlogLayout cleanup: removed the Newsletter section, share buttons, and bookmark feature from individual post pages. Just the article now.

Animation cleanup: replaced dozens of imported FramerMotionVariants with simple inline transitions. Less code, same feel, easier to read:

// Before: importing complex variants from a separate file
import { FadeContainer, popUp } from "@content/FramerMotionVariants";

// After: simple inline
<motion.div
  initial={{ opacity: 0, y: 12 }}
  animate={{ opacity: 1, y: 0 }}
  transition={{ type: "spring", stiffness: 300, damping: 24 }}
>

What’s Next

The Projects page still needs the same treatment the blog got. I’ve already plumbed in the featured prop on Project.tsx, just haven’t gotten there yet. There’s also a /stats page refresh on the list.

But the core is done and it finally feels like mine.

All the code is open source at github.com/j471n/j471n.in. If you want to lift any of the patterns, the watermark technique, the invert-fill buttons, the marquee, the accent bars, go for it.

Implementing a Notion-Style Note Comment Feature with Flutter and Supabase

Introduction

自分株式会社 (https://my-web-app-b67f4.web.app/) is an AI-integrated life-management app built with Flutter Web + Supabase. To close one of its feature gaps with Notion, I implemented a comment feature this time around.

Notion ships with page-level comments as a standard feature — the one that lets you leave sticky-note-style comments on your own notes, like "check this later", "this part is important", or "idea memo".

What I Built

1. DB Schema (Migration)

CREATE TABLE IF NOT EXISTS note_comments (
  id         uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  note_id    bigint NOT NULL REFERENCES notes(id) ON DELETE CASCADE,
  user_id    uuid NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
  content    text NOT NULL CHECK (length(trim(content)) > 0),
  created_at timestamptz NOT NULL DEFAULT now(),
  updated_at timestamptz NOT NULL DEFAULT now()
);

ALTER TABLE note_comments ENABLE ROW LEVEL SECURITY;

-- RLS: users can only operate on their own comments
CREATE POLICY "Users can view own note comments"
  ON note_comments FOR SELECT USING (auth.uid() = user_id);

CREATE POLICY "Users can insert own note comments"
  ON note_comments FOR INSERT WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can update own note comments"
  ON note_comments FOR UPDATE
  USING (auth.uid() = user_id) WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can delete own note comments"
  ON note_comments FOR DELETE USING (auth.uid() = user_id);

CREATE INDEX IF NOT EXISTS note_comments_note_id_created_at_idx
  ON note_comments (note_id, created_at ASC);

The key point is RLS (Row Level Security). With Supabase's RLS you can enforce "you only see your own data" at the SQL layer. Rather than relying on front-end filtering, the data is protected at the DB level, which is reassuring.

2. Supabase Edge Function (Deno)

// note-comments/index.ts
serve(async (req) => {
  const userId = getUserIdFromJwt(req);
  if (!userId) return json({ error: "unauthorized" }, 401);

  if (req.method === "GET") {
    // fetch the comment list by note_id (with a note-ownership check)
    const { data } = await client
      .from("note_comments")
      .select("id, content, created_at, updated_at")
      .eq("note_id", noteId)
      .eq("user_id", userId)
      .order("created_at", { ascending: true });
    return json({ comments: data ?? [] });
  }

  if (req.method === "POST") {
    // add a comment (content capped at 2,000 characters)
    const { data } = await client
      .from("note_comments")
      .insert({ note_id: noteId, user_id: userId, content })
      .select("id, content, created_at, updated_at")
      .single();
    return json({ ok: true, comment: data });
  }

  if (req.method === "DELETE") {
    // delete a comment (own comments only)
    await client.from("note_comments")
      .delete()
      .eq("id", commentId)
      .eq("user_id", userId);
    return json({ ok: true });
  }
});

The Edge Function uses the service-role key, but it extracts user_id from the JWT and appends .eq("user_id", userId) to every operation — a second layer of safety.

3. Flutter UI

// added to the NoteEditorPage AppBar
if (_currentNoteId != null)
  Stack(
    alignment: Alignment.topRight,
    children: [
      IconButton(
        icon: const Icon(Icons.comment_outlined),
        onPressed: _showComments,
        tooltip: 'コメント',
      ),
      if (_commentCount > 0)
        Positioned(
          top: 6, right: 6,
          child: Container(
            // badge
            child: Text('$_commentCount', ...),
          ),
        ),
    ],
  ),

Showing the comment count as a badge makes it obvious at a glance that a note has comments.

A BottomSheet displays the comment list and input field, and a DraggableScrollableSheet makes its height adjustable.

Where I Got Stuck

Migrating from withOpacity to withValues

In recent Flutter 3.x releases, withOpacity is deprecated and you need to use withValues(alpha: x) instead. I caught the usages with flutter analyze and fixed them.

// ❌ deprecated
color: Colors.indigo.withOpacity(0.05),

// ✅ correct
color: Colors.indigo.withValues(alpha: 0.05),

Putting unawaited to Work

When loading the comment count in the background, unawaited() makes it explicit that the async call is intentionally not awaited.

// fire-and-forget via dart:async's unawaited
unawaited(_loadCommentCount());

Summary

  • Safe multi-tenant data management with PostgreSQL RLS + Supabase
  • Note-ownership checks enforced twice over in the Edge Function
  • Smooth bottom-sheet UX with Flutter's DraggableScrollableSheet
  • Implementation completed while keeping flutter analyze at zero issues

Next up: laying the groundwork for team workspaces (real-time collaborative editing).

URL: https://my-web-app-b67f4.web.app/

#FlutterWeb #Supabase #buildinpublic #Dart

Building a Local AI Assistant on Linux — Recent Progress on Echo

Last week, I made significant strides in building my local AI assistant, Echo, on my Ubuntu machine. This article covers the recent updates, including how I refined my AI’s content strategy, improved my trading bots, and enhanced the session checkpoint system.

2026-04-01 — Publisher Wired to Content Strategy

I’ve been working on making my content more dynamic and relevant by integrating it with a content strategy file. Here’s how I did it:

# echo_devto_publisher.py
import json

def read_content_strategy():
    with open('content_strategy.json', 'r') as f:
        return json.load(f)

def update_publisher():
    """Pick the next topic: an explicit 'next' wins, then the queue, else generic."""
    strategy = read_content_strategy()
    if strategy.get('next'):
        return strategy['next']
    if strategy.get('queued'):
        return strategy['queued'][0]
    return None  # publisher falls back to generic content

next_topic = update_publisher()

After making this change, I reset my content queue to ensure all my topics are ready to publish. I also deleted any generic articles from March 31 to keep my feed fresh. Next Tuesday, I’ll be sharing how I built a two-way phone bridge for my AI using ntfy.sh.

2026-04-01 — Trade Brain v2

I’ve been working on my trading bots, specifically the Trade Brain, which has seen a few updates. Here are the key changes:

  • Increased Position Sizing: I’ve increased the position size to 10% per trend trade and 8% for momentum trades, with a maximum of 8 positions.
  • Added Trailing Stop: This feature protects gains after a 2% upward movement.
  • Sector Awareness: The bot now prevents over-concentration in the same sector.
  • Updated Watchlists: I’ve added XOM (energy), IWM (small cap), RKLB, and IONQ to the watchlist.
  • Fixed Take Profit: For trend trades, the take profit is set to 5%, and for momentum trades, it’s 3%.

The first v2 cycle saw the entry of XOM (energy trend) and RKLB (momentum).

2026-04-01 — Crypto Brain Live

I’ve also made progress on my Crypto Brain, a 24/7 trading bot for cryptocurrencies. Here are the details:

# core/crypto_brain.py
import alpaca_trade_api as tradeapi
import pandas as pd
import talib

API_KEY = 'your_api_key'
API_SECRET = 'your_api_secret'
BASE_URL = 'https://paper-api.alpaca.markets'

api = tradeapi.REST(API_KEY, API_SECRET, BASE_URL, api_version='v2')

ASSETS = ['BTC/USD', 'ETH/USD', 'SOL/USD', 'AVAX/USD']

def get_crypto_data():
    # 720 hourly bars (~30 days) per asset, keyed by symbol
    dfs = [api.get_bars(asset, '1H', limit=720).df for asset in ASSETS]
    return pd.concat(dfs, keys=ASSETS)

def crypto_signals(df):
    """Entry signal for one asset: oversold RSI plus a 6-hour momentum check."""
    rsi = talib.RSI(df['close'], timeperiod=14)
    oversold = rsi < 30                          # mean-reversion entry zone
    momentum = df['close'].pct_change(6) > 0.06  # 6-hour momentum confirmation
    return oversold & momentum

data = get_crypto_data()
signals = data.groupby(level=0, group_keys=False).apply(crypto_signals)
print(data[signals])

This bot uses the RSI mean reversion strategy combined with a 6-hour momentum check. The take profit is set to 4%, and the stop loss is 2%. The first scan revealed that all coins were in the oversold RSI range (31-33).

2026-04-02 — Session Checkpoint Upgraded

To ensure a smooth session summary, I upgraded the session checkpoint system:

# session_checkpoint.py
import json
import re

def collect_session_focus():
    # an explicit override in the JSON sidecar wins
    with open('session_summary.json', 'r') as f:
        session_summary = json.load(f)
    if 'override_focus' in session_summary:
        return session_summary['override_focus']
    # otherwise pull the first few "## " headers from the markdown summary
    # (markdown file name is an assumption)
    with open('session_summary.md', 'r') as f:
        headers = [line.lstrip('#').strip() for line in f
                   if re.match(r'^##\s', line)]
    return ' + '.join(headers[:4])

focus = collect_session_focus()
print(focus)

This script now collects all ## headers from the session summary and filters out noise. The focus is then joined into a natural-sounding briefing by the LLM, which speaks the joined focus at 8am.

2026-04-02 — Briefing Fixed — Direct Ollama Call

Finally, I fixed the daily briefing by calling Ollama directly via HTTP:


# daily_briefing.py
import requests

def call_ollama(prompt):
    # Ollama's local HTTP API (default port 11434) — no cloud round-trip
    url = 'http://localhost:11434/api/generate'
    data = {
        'model': 'llama3.2',  # model name illustrative
        'prompt': prompt,
        'stream': False,
    }
    resp = requests.post(url, json=data, timeout=120)
    resp.raise_for_status()
    return resp.json()['response']
How I Built a Lightning-Fast Face Recognition Batch Processor using Python & Docker

Introduction

Have you ever tried to find a specific person in a folder containing thousands of event photos? Whether it’s a wedding, a graduation, or a corporate event, photographers spend hours manually sifting through galleries to deliver personalized photo sets to their clients.

I wanted to automate this, but I quickly ran into a wall: performing deep learning face recognition on thousands of high-resolution images is computationally expensive and memory-hungry.

So, I built Py_Faces, a batch face recognition system that solves this by separating the heavy lifting from the actual search process. Here is how I designed the architecture to search through thousands of photos in seconds, and how I tamed Docker memory limits along the way.

The Architecture: Calculate Once, Search Instantly

The biggest mistake I could have made was scanning the entire photo folder every time the user wanted to search for a new person. Instead, I split the system into three completely independent steps:

  • Step 1: The Heavy Lifting (Encoding Extraction)
    The first script (escaner_encodings.py) scans every photo in the batch just once. It detects the faces, applies a CLAHE filter (Contrast Limited Adaptive Histogram Equalization) to handle bad lighting, and extracts a 128-dimension facial encoding vector using the face_recognition library.

These vectors—along with metadata and file paths—are saved into a binary .pkl file. This process can take around an hour depending on the CPU, but it’s a one-time cost.

  • Step 2: Defining the Target
    When the user wants to find someone, they drop a few clear photos of that person into a persona_objetivo folder. The second script (definir_objetivo.py) extracts the encodings from these reference photos and averages them out to create a highly accurate “Target Profile”.

  • Step 3: The Lightning-Fast Search
    Here is where the magic happens. The third script (buscador_objetivo.py) doesn’t look at images at all. It simply loads the massive .pkl file from Step 1 and uses NumPy to calculate the Euclidean distance between the “Target Profile” and every face in the batch.

Because it’s just comparing arrays of numbers, searching through thousands of photos takes about 2 seconds. The script then automatically copies the matching photos into a new folder and generates a detailed Excel report using pandas.
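Because the search is just array arithmetic, the core comparison can be sketched in a few lines of NumPy. The encodings below are illustrative stand-ins, and 0.6 is the conventional match threshold used with the face_recognition library:

```python
import numpy as np

# Illustrative stand-ins: a 128-dim averaged "Target Profile" and a
# batch of stored encodings loaded from the .pkl file (n_faces x 128)
target = np.full(128, 0.5)
batch = np.vstack([np.full(128, 0.5), np.zeros(128)])

# Euclidean distance from the target profile to every stored encoding
distances = np.linalg.norm(batch - target, axis=1)

# face_recognition's conventional match threshold
matches = distances <= 0.6
```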

Taming the Docker & Memory Beast
To make this tool accessible, I wrapped it in a Docker container (python:3.11-slim). This avoids the nightmare of making users install C++ build tools, CMake, and dlib natively on Windows.

However, this introduced a massive challenge: Memory Management.
Docker Desktop on Windows (WSL2) limits memory usage. Processing high-res images with HOG or CNN models in parallel quickly leads to BrokenExecutor crashes because the container runs out of RAM.

To fix this, I implemented a dynamic worker calculation function that checks the actual available RAM inside the Linux container (/proc/meminfo) before launching the ProcessPoolExecutor:

```python
def calcular_workers():
    """Estimates safe workers based on free memory in the container."""
    import os
    memoria_por_worker_gb = 1.2  # Estimated RAM per parallel process
    try:
        if os.path.exists('/proc/meminfo'):
            with open('/proc/meminfo') as f:
                for linea in f:
                    if linea.startswith('MemAvailable:'):
                        mem_kb = int(linea.split()[1])
                        mem_gb = mem_kb / (1024 * 1024)
                        # Reserve at least 0.8 GB for the OS and main orchestrator
                        mem_disponible_gb = max(0.5, mem_gb - 0.8)
                        workers_por_ram = int(mem_disponible_gb / memoria_por_worker_gb)
                        return max(1, min(workers_por_ram, os.cpu_count() or 4))
    except Exception:
        pass

    # Safe fallback
    cpus = os.cpu_count() or 2
    return max(1, int(cpus / 2))
```

This lets the script scale to its environment: on a 16GB machine, it maximizes the workers; in a constrained Docker environment, it dials them back to prevent crashes. Furthermore, images are resized (e.g., to a maximum width of 1800px or 2400px) before processing to keep memory spikes in check.
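The pre-resize step can be a simple Pillow downscale. This is a minimal sketch, not the project's actual code; the 1800px cap matches the width mentioned above, and `downscale` is a hypothetical helper name:

```python
from PIL import Image

MAX_WIDTH = 1800  # cap from the article; raise or lower to fit your RAM budget

def downscale(img):
    """Resize so width <= MAX_WIDTH, preserving aspect ratio."""
    if img.width > MAX_WIDTH:
        new_height = round(img.height * MAX_WIDTH / img.width)
        img = img.resize((MAX_WIDTH, new_height), Image.LANCZOS)
    return img
```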

Dealing with Real-World Dirty Data
When dealing with raw client photos, you learn quickly that data is never clean. I had to implement several fallbacks:

EXIF Orientations: Photos taken vertically often appear horizontal to dlib. I wrote a utility using Pillow (PIL) to read EXIF tags and physically rotate the arrays before detection.
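Pillow ships a helper for exactly this; a minimal version of such a utility might look like the following (`load_upright` is an illustrative name, not the project's actual function):

```python
from PIL import Image, ImageOps

def load_upright(path):
    """Open an image and bake its EXIF Orientation tag into the
    pixel data, so dlib sees the photo the way it was shot."""
    img = Image.open(path)
    # exif_transpose rotates/flips according to the Orientation tag
    return ImageOps.exif_transpose(img)
```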

Sequential Retries: If the multiprocessing pool does crash, the script catches the BrokenExecutor error, rescues the failed batch, and processes them sequentially so the user doesn’t lose an hour of progress.
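The rescue logic can be sketched roughly like this, with `worker_fn` standing in for the actual per-image encoding function:

```python
from concurrent.futures import ProcessPoolExecutor, BrokenExecutor

def process_batch(paths, worker_fn, max_workers):
    """Run the pool first; if it dies (e.g. an OOM-killed worker raises
    BrokenExecutor), rescue the same batch by processing it sequentially."""
    try:
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(worker_fn, paths))
    except BrokenExecutor:
        # Pool crashed: don't lose the batch, just go one by one
        return [worker_fn(p) for p in paths]
```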

Conclusion
Building Py_Faces taught me that sometimes the best way to optimize a slow process isn’t to write faster algorithms, but to change the architecture entirely. By decoupling the extraction from the comparison, a heavy machine-learning task became an instant search tool.

You can check out the full code on my GitHub: https://github.com/daws-4/pyfaces

Have you ever dealt with memory leaks or dlib crashes in Docker? I’d love to hear how you solved them in the comments!

How To Improve UX In Legacy Systems

Imagine that you need to improve the UX of a legacy system. A system that has been silently working in the background for almost a decade. It’s slow, half-broken, unreliable, and severely outdated — a sort of “black box” that everyone relies upon, but nobody really knows what’s happening under the hood.

Where would you even start? Legacy stories are often daunting, adventurous, and utterly confusing. They represent a mixture of fast-paced decisions, quick fixes, and accumulating UX debt.

There is no one-size-fits-all solution to tackle them, but there are ways to make progress, albeit slowly, while respecting the needs and concerns of users and stakeholders. Now, let’s see how we can do just that.

The Actual Challenges Of Legacy UX

It might feel that legacy products are waiting to be deprecated at any moment. But in reality, they are often critical for daily operations. Many legacy systems are heavily customized for the needs of the organization, often built externally by a supplier and often without rigorous usability testing.

It’s common for enterprises to spend 40–60% of their time managing, maintaining, and fine-tuning legacy systems. They are essential, critical — but also very expensive to keep alive.

1. Legacy Must Co-Exist With Products Built Around Them

Running in a broken, decade-old ecosystem, legacy still works, yet nobody knows exactly how or why. The people who originally set it up have probably left the company years ago, leaving behind a lot of unknowns and poorly documented work.

With them come fragmented and inconsistent design choices, stuck in old versions of old design tools that have long been discontinued.

Still, legacy systems must neatly co-exist within modern digital products built around them. In many ways, the end result resembles a Frankenstein — many bits and pieces glued together, often a mixture of modern UIs and painfully slow and barely usable fragments here and there — especially when it comes to validation, error messages, or processing data.

2. Legacy Systems Make or Break UX

Once you sprinkle a little bit of quick bugfixing, unresolved business logic issues, and unresponsive layouts, you have a truly frustrating experience, despite the enormous effort put into the rest of the application.

If one single step in a complex user flow feels utterly broken and confusing, then the entire product appears to be broken as well.

Well, eventually, you’ll have to tackle legacy. And that’s where we need to consider available options for your UX roadmap.

UX Roadmap For Tackling Legacy Projects

Don’t Dismiss Legacy: Build on Existing Knowledge

Legacy systems are often big unknowns that cause a lot of frustration for everyone, from stakeholders to designers to engineers to users. The initial thought might be to remove them entirely and redesign from scratch, but in practice, that’s not always feasible. A big-bang redesign is a remarkably expensive and very time-consuming endeavor.

Legacy systems hold valuable knowledge about the business practice, and they do work — and a new system must perfectly match years of knowledge and customization done behind the scenes. That’s why stakeholders and users (in B2B) are typically heavily attached to legacy systems, despite all their well-known drawbacks and pains.

Because such systems are at the very heart of the business, operating on them seems extremely risky to most people, requiring significant caution and preparation. Corporate users don’t want big risks. So instead of dismissing legacy entirely, we might start by gathering existing knowledge first.

Map Existing Workflows and Dependencies

The best place to start is to understand how and where exactly legacy systems are in use. You might discover that some bits of the legacy systems are used all over the place — not only in your product, but also in business dashboards, by external agencies, and by other companies that integrate your product into their services.

Very often, legacy systems have dependencies on their own, integrating other legacy systems that might be much older and in a much worse state. Chances are high that you might not even consider them in the big-bang redesign — mostly because you don’t know just how many black boxes are in there.

Set up a board to document current workflows and dependencies to get a better idea of how everything works together. Include stakeholders, and involve heavy users in the conversation. You won’t be able to open the black box, but you can still shed some light on it from the perspectives of different people who may be relying on legacy for their work.

Once you’ve done that, set up a meeting to reflect to users and stakeholders what you have discovered. You will need to build confidence and trust that you aren’t missing anything important, and you need to visualize the dependencies that a legacy tool has to everyone involved.

Replacing a legacy system is never about legacy alone. It’s about the dependencies and workflows that rely on it, too.

Choose Your UX Migration Strategy

Once you have a big picture in front of you, you need to decide on what to do next. Big-bang relaunch or a small upgrade? Which approach would work best? You might consider the following options before you decide on how to proceed:

  • Big-bang relaunch.
    Sometimes the only available option, but it’s very risky, expensive, and can take years, without any improvements to the existing setup in the meantime.
  • Incremental migration.
    Slowly retire pieces of legacy by replacing small bits with new designs. This offers quicker wins in a Frankenstein style but can make the system unstable.
  • Parallel migration.
    Run a public beta of the replacement alongside the legacy system to involve users in shaping the new design. Retire the old system when the new one is stable, but be prepared for the cost of maintaining both.
  • Incremental parallel migration.
    List all business requirements the legacy system fulfills, then build a new product to meet them reliably, matching the old system from day one. Test early with power users, possibly offering an option to switch systems until the old one is fully retired.
  • Legacy UI upgrade + public beta.
    Perform low-risk fine-tuning on the legacy system to align UX, while incrementally building a new system with a public beta. This yields quicker and long-term wins, ideal for fast results.

Replacing a system that has been carefully refined and heavily customized for a decade is a monumental task. You can’t just rebuild from scratch in a few weeks what others have been working on for years.

So whenever possible, try to increment gradually, involving users and stakeholders and engineers along the way — and with enough buffer time and continuous feedback loops.

Wrapping Up

With legacy projects, failure is often not an option. You’re migrating not just components, but users and workflows. Because you operate on the very heart of the business, expect a lot of attention, skepticism, doubts, fears, and concerns. So build strong relationships with key stakeholders and key users and share ownership with them. You will need their support and their buy-in to put your UX work into action.

Stakeholders will request old and new features. They will focus on edge cases, exceptions, and tiny tasks. They will question your decisions. They will send mixed signals and change their opinions. And they will expect the new system to run flawlessly from day one.

And the best thing you can do is to work with them throughout the entire design process, right from the very beginning. Run a successful pilot project to build trust. Report your progress repeatedly. And account for intense phases of rigorous testing with legacy users.

Revamping a legacy system is a tough challenge. But there is rarely any project that can have so much impact on such a scale. Roll up your sleeves and get through it successfully, and your team will be remembered, respected, and rewarded for years to come.

Meet “Measure UX & Design Impact”

Meet Measure UX & Design Impact, Vitaly’s practical guide for designers and UX leads on how to track and visualize the incredible impact of your UX work on business — with a live UX training later this year.


Useful Resources

  • UX Migration Strategy For Legacy Apps, by Tamara Chehayeb Makarem
  • How To Improve Legacy Systems, by Christopher Wong
  • Designing With Legacy, by Peter Zalman
  • Redesigning A Large Legacy System, by Pawel Halicki
  • How To Manage Legacy Code, by Nicolas Carlo
  • How To Transform Legacy, by Bansi Mehta
  • Design Debt 101, by Alicja Suska
  • Practical Guide To Enterprise UX, by Yours Truly
  • Healthcare UX Design Playbook, by Yours Truly

Citation Needed: Structured data extraction workflows

In the previous article we explored how to generate and use structured data in a workflow. Now, let’s take it a step further.

We’ll build a workflow that checks whether an article provides evidence to support its claims (but not whether the evidence itself is valid). Rather than using this to fact check articles in the wild, this might be useful for critiquing your own writing before submission or checking generated text for hallucinations.

This task is impractical to automate without generative language models. Natural language processing pipelines might be able to extract or categorize entities and phrases from a text, but this task requires a degree of reading comprehension not available without larger language models.

Furthermore, while many language models are capable of performing individual steps, the overall process requires more rigor and discipline than they are trained for. Frontier models might handle moderately complex tasks, but verifying that they haven’t hallucinated the results requires additional work on par with this workflow.

What we can do instead is split the task into distinct steps: extracting claims then checking each of them. In this article we’ll look into the first part using our old friend the LLM › Structured node.

Claims Schema

In the Structured Generation tutorial we saw how to generate a single structured entry from scratch. LLMs are capable of handling much more complexity. This time we will ask the model to determine which phrases in a text are factual claims and place them into a list. Furthermore, we ask the model to rank the importance of each claim, holistically, when deciding whether to include it.

Like before, create a new workflow and swap out the normal Chat for a Structured node.

Create a Parse JSON node and connect it to the schema input of the Structured node. Fill it with this schema conveniently generated by an LLM:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "ClaimsList",
  "type": "object",
  "properties": {
    "claims": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "maxItems": 5,
      "description": "A list of claim strings. The list must contain at least one and at most five items."
    }
  },
  "required": [
    "claims"
  ],
  "additionalProperties": false
}

📢 important
Technically, an array at the top-level would be a valid schema.

However, many models have trouble generating data with that format. To ensure compatibility between providers, wrap the array in an object. Then extract the list later using JSON transformations.
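Unwrapping afterwards is a one-step transformation: in jq the filter is simply `.claims`, and the equivalent lookup in Python (with an illustrative response) is:

```python
import json

# Illustrative raw output from the Structured node, conforming to the
# ClaimsList schema above
raw = '{"claims": ["Honey bees pollinate many crops.", "Apiaries can hold dozens of hives."]}'

# jq's `.claims` is equivalent to this lookup:
claims = json.loads(raw)["claims"]
```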

Instructions

In the previous example we combined instructions with dynamic data into the prompt. This time we’ll reserve the system message for instructions and inject the data in a separate step.


By partitioning the instructions and the data it becomes much easier to reuse the workflow on new inputs. We can use the system message field of the Agent node for instructions:

Follow these instructions exactly.
Do not respond directly to the user.
Do not hallucinate the final answer.

## Instructions

Extract the key factual claims in the user's statement and format them into a list (5 items or fewer).
Ensure that each claim can stand alone without additional context to make sense of it.

💡 tip
You should experiment with variations on the instructions, particularly the preamble, to optimize them for your preferred language model. I find this combination effective with the Nemotron family and various other open models.

The system message is sent once at the beginning of each request. Theoretically, the LLM should pay special attention to it. Either way, it avoids repeating the instructions in every prompt of a conversation, even though the entire conversation is sent with every request. [1]

Input Document

The input document for a workflow will typically be supplied by the runner. While developing a workflow, however, it’s convenient to create a node for a predefined text to take advantage of iterative execution. In the final version of the workflow we can delete this node and connect to the input of the Start node.


Create a Value › Plain Text node to hold the article content.

Connect it to the prompt input of the Structured node.

Paste the contents of an article into the text field. I’m using a Wikipedia article about apiaries (artificial beehives).

Claim Checking

We now have a workflow that generates a list of claims from a text. Our eventual goal is to have each claim checked individually against the original text, which will be supplied to the language model in a context document.

However, before learning how to check every item, we should first explore how to check a single item.

list indexing

First, let’s pull a single claim out of the structured generation using JSON › Transform JSON. This node uses a jq filter to manipulate JSON.

The filter .claims[1] tells it to access the “claims” field and return the second element (0-indexed).
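If jq syntax is unfamiliar, the filter maps directly onto ordinary dictionary and list indexing; the sample document here is illustrative:

```python
import json

doc = '{"claims": ["first claim", "second claim", "third claim"]}'

# jq filter `.claims[1]`: the "claims" field, element at index 1
# (0-indexed), i.e. the second claim
second = json.loads(doc)["claims"][1]
```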

💡 tip
Ask your favorite frontier LLM for help writing jq filters from sample data.

Add a second Agent node with these instructions:

Follow these instructions exactly.
Do not respond directly to the user.
Do not hallucinate the final answer.

## Instructions

Help the user analyze the article in the context file.
The user is examining individual claims that the article makes.

Determine whether the context provides supporting evidence for the claim stated by the user.
List the reference or citation provided by the article.

DO NOT interpret the article as evidence for a claim made by the user.
The user is simply examining a claim made by the article.

context documents

How can we provide the article as context for the LLM? There are several ways:

  • Inject it into the system message using templating
  • Provide it as a user message in the conversation
  • Use an LLM › Context node

The third option is cleanest since it provides a clear demarcation between instructions, context and prompt. The Context node sits between the agent and a chat node, augmenting the agent by injecting its contents into requests made by the agent.

Connect the Plain Text node containing the article to the context input. In the final version of the workflow, this should be connected to the input pin of the Start node.

unstructured check

We can use a simple Chat node to do a quick spot check on how the context affects the language model response. However, to facilitate checking the entire collection, the responses for each item should be structured.

Structured Check

Replace the Chat node with a Structured node, connecting it to the Context and Transform nodes.

Use this schema for the claims checking Structured node:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "A factual claim with evidence from citations or references",
  "type": "object",
  "required": [
    "claim",
    "grounding"
  ],
  "properties": {
    "claim": {
      "type": "string",
      "description": "the original claim made by the article"
    },
    "grounding": {
      "enum": [
        "not a claim",
        "unsupported",
        "fully supported"
      ],
      "description": "The level of support for the claim provided by citations and references. If the provided text is actually a definition or something other than a claim, then \"not a claim\"."
    },
    "evidence": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "The citations and references that support the claim. Empty if the claim is not supported."
    }
  }
}


Connect the unwrapped claim to the prompt and run.

By changing the claim index we can see how it handles different claims and statements.

Conclusion

In this tutorial we’ve explored using language models to extract structured data from plain text, then transforming data for further processing. The workflow is still incomplete since we’ve only checked one claim.

Before we can go any further, we’ll need to learn about iterating over lists using subgraphs. This will allow us to check every claim individually, then draw a conclusion by combining all results.

  1. Some LLM providers support caching portions of the request. However, since this behavior isn’t standardized across providers yet, aerie does not support it. ↩