Implemented data serialization
The game engine can now save the current model, meshes, color, etc., and it can load that save data and render the model, meshes, color, etc. back into the game engine.
Built Log Stripper: A VS Code Extension to Remove Debug Logs Across 23+ Languages
Every developer has a version of this story.
You’re about to open a pull request. The code works. The tests pass. Everything looks good.
Then you do one final scan and find this:
console.log("HERE");
console.log("user:", user);
console.log("final:", result);
And then, before the PR:
Search → Delete. Search → Delete. Search → Delete.
I got tired of it. So I built a tool to automate it.
What is Log Stripper?
Log Stripper is a VS Code extension that removes debug, log, and print statements from 23+ programming languages – with a preview so you always see what will be deleted before anything changes.

🔗 VS Code Marketplace
🔗 GitHub
The Features
🔍 Preview mode – Ctrl+Shift+D
Shows every line that will be removed. You confirm. Only then does anything change.
🎨 Highlight mode – Ctrl+Shift+H
Marks debug lines in red. No file modification. Review first, strip later.
🌍 23+ languages
JS, TS, Python, Java, Go, Rust, C#, Swift, Kotlin, Dart, Ruby, PHP, C, C++, Shell, Lua, Scala, Elixir, Haskell, R, Vue, Svelte, JSX/TSX
🏢 Workspace cleanup
Strip debug statements from an entire project in one command.
Why not just use grep/sed?
You could. But:
- No preview
- No multiline support
- No VS Code integration
- No safety rules
- Works differently on Windows vs Mac vs Linux
Log Stripper handles all of this inside VS Code with a consistent UX.
The Interesting Technical Part: Multiline Removal
Removing a single-line console.log("x") is trivial. The interesting case is:
console.log(
"user:",
JSON.stringify(user, null, 2),
"role:",
user.role
);
To remove this correctly, you need to:
- Find the opening (
- Track paren depth across lines
- Handle parens inside strings (don’t count those)
- Find the exact closing ) and optional ;
- Remove the entire block without touching surrounding code
Here’s the core of skipParenBlock:
function skipParenBlock(lines: string[], startLine: number) {
  const openIdx = lines[startLine].indexOf("(");
  // bal = open-paren depth; (li, ci) = current line and column
  let bal = 1, li = startLine, ci = openIdx + 1;
  let inStr: string | null = null; // active quote character while inside a string
  while (li < lines.length) {
    const s = lines[li];
    while (ci < s.length) {
      const ch = s[ci];
      if (inStr) {
        if (ch === "\\") { ci += 2; continue; } // skip the escaped character
        if (ch === inStr) inStr = null;
      } else if (ch === '"' || ch === "'" || ch === "`") {
        inStr = ch;
      } else if (ch === "(") {
        bal++;
      } else if (ch === ")") {
        bal--;
        if (bal === 0) {
          ci++;
          if (s[ci] === ";") ci++; // consume the optional semicolon
          return { endLine: li, endCol: ci };
        }
      }
      ci++;
    }
    li++; ci = 0;
  }
}
Architecture Decision
I separated the core logic from the VS Code layer:
src/
├── extension.ts ← VS Code commands, UI, workspace
└── stripper.ts ← Pure logic, zero VS Code deps
Why? Because stripper.ts can be tested without launching a VS Code host:
node --test out-test/test/stripper.test.js
151 tests. Fast. No mocking needed.
Safety Rules
The extension will never remove:
// console.log("commented") ← comment line, always kept
return console.log(x); ← inline code, always kept
const x = doThing() || console.log("fallback"); ← inline, kept
Only whole-statement debug calls that are the entire line (after indentation) are removed.
The Languages Covered
| Language | Examples |
|---|---|
| JS/TS | console.log, debugger |
| Python | print(), logging.debug(), breakpoint() |
| Java | System.out.println, logger.debug() |
| Go | fmt.Println, log.Fatal |
| Rust | println!, dbg!, eprintln! |
| C# | Console.WriteLine, Debug.Write |
| Swift | print(), NSLog() |
| Ruby | puts, binding.pry |
| PHP | var_dump(), dd() |
| Shell | echo, printf |
| …13 more | – |
Try It
# Install from Marketplace
code --install-extension saurabhchoudhary.log-stripper
Or search “Log Stripper” in the VS Code extensions panel.
Open a file with debug statements → Ctrl+Shift+D → see the preview → confirm.
- Ctrl+Shift+H | Toggle Debug Highlights | Highlight/unhighlight debug lines
- Ctrl+Shift+D | Preview & Strip Current File | Shows preview, then strips on confirm
- Command Palette | Strip Entire Workspace | Clean whole project
- Right-click → menu | Strip Current File | Strip without preview
- Editor toolbar icons (eye / eye-closed / trash)
Built this because I couldn’t find a tool that did all of this properly. Turned out to be one of the most satisfying weekend projects I’ve done.
If you try it, I’d love feedback in the comments or as a GitHub issue. 🙌
By Saurabh Choudhary – Software Engineer
Introduction
Every developer has a version of this story.
You’re about to open a pull request. The code works. The tests pass. You’re ready. Then you do one last scroll through the file and see it:
console.log("HERE");
console.log("user data:", JSON.stringify(userData));
console.log("response", res);
Three debug logs you forgot to remove. You delete them, re-run the linter, and open the PR.
Two weeks later, someone reports that production logs are noisy with debug output. A developer on your team had left a logger.debug() in a hot path. Nobody caught it in review.
This isn’t a rare occurrence. It’s a pattern that plays out in every team, on every codebase, in every programming language.
I’ve experienced it across JavaScript, TypeScript, PHP, Go, Python, Vue.js, NestJS, Laravel, and Angular projects. Different languages, different ecosystems – same repetitive problem.
Eventually I stopped accepting it as “just part of the workflow” and decided to build a proper solution.
That solution is Log Stripper – a VS Code extension that removes debug, log, and print statements from 23+ programming languages with a safety-first, preview-driven approach.
The Problem
The problem has two dimensions that are easy to underestimate.
First: it’s repetitive. Before every commit, before every PR, before every release, developers go through the same manual process of searching for and deleting debug statements. On large codebases with dozens of files touched in a single feature branch, this becomes genuinely time-consuming.
Second: it’s error-prone. Humans miss things. A console.log inside a nested callback, a print() inside a conditional branch, a System.out.println buried in a Java service – these are easy to overlook when scanning files manually. And when they slip through, they end up in production.
There’s also a subtler issue: developers working under time pressure will rush the cleanup. The more pressure, the more likely something gets missed.
This is exactly the kind of task that should be automated. It is deterministic, repetitive, and rule-based. A computer should do it.
Why Existing Workflows Were Frustrating
Before building Log Stripper, I searched for existing solutions.
I found a few extensions that handled console.log removal for JavaScript. Some worked well for that specific case. But they had significant limitations:
Language coverage was narrow. Most extensions focused exclusively on JavaScript. If you wrote Python, Go, Java, or Rust, you were on your own.
No preview mode. Several extensions deleted statements immediately without showing you what would be removed. That’s unsafe. A developer who accidentally removes a legitimate logger call in production code will stop trusting the tool immediately.
No highlight mode. Sometimes I want to see the debug statements in a file without removing anything yet. I want to review them, understand them, decide manually. None of the tools I found offered non-destructive inspection.
No workspace cleanup. File-by-file cleanup doesn’t scale. Before a major release, I want to scan an entire project and clean everything in one pass.
Unreliable multiline handling. A console.log that spans multiple lines is common:
console.log(
"processing user:",
JSON.stringify(user, null, 2)
);
Many tools would only remove the first line and leave the rest, creating syntax errors.
Why I Built Log Stripper
The decision to build rather than adapt came down to one realization: the problem deserved a proper solution, not a workaround.
I wanted an extension that:
- Worked across every major language I use day-to-day
- Showed a preview before making any changes
- Could highlight debug lines without modifying files
- Could clean an entire workspace safely
- Was smart enough to never break code it shouldn’t touch
- Was backed by automated tests
None of those requirements are unreasonable. Together, they’re the minimum bar for a tool I’d actually trust.
The weekend I started building it, I intended it to be a one-day project. It grew into something more substantial as I realized the edge cases, the safety requirements, and the value of doing it properly.
Architecture
The most important architectural decision I made was separating the core logic from the VS Code integration layer.
The repository has two source files:
src/
├── extension.ts ← VS Code commands, UI, workspace operations
└── stripper.ts ← Core strip logic (zero VS Code dependencies)
stripper.ts has no awareness of VS Code. It takes a string of text and a language ID, and returns a new string with debug statements removed plus metadata about what was changed. That’s it.
This separation has two major benefits:
Testability. I can test the core logic without launching a VS Code Extension Development Host. The 151 automated tests run with a plain node --test command. Fast, simple, reliable.
Maintainability. If VS Code changes its API, or if I want to port the logic to a CLI tool or a git pre-commit hook, the core logic doesn’t need to change.
The VS Code layer handles everything user-facing: commands, keybindings, settings, decorations, progress notifications, and workspace file scanning.
Multi-Language Support
Supporting 23 languages sounds ambitious. The implementation is actually straightforward once you have the right architecture.
Each language is a key in the LANGUAGE_PATTERNS object, and its value is an array of regular expressions:
const LANGUAGE_PATTERNS: Record<string, RegExp[]> = {
  javascript: [
    /^(\s*)console\.(log|debug|info|warn|error|trace|...)\s*\(/,
    /^(\s*)debugger\s*;?$/,
  ],
  python: [
    /^(\s*)print\s*\(/,
    /^(\s*)logging\.(debug|info|warning|error|critical)\s*\(/,
    /^(\s*)breakpoint\s*\(\s*\)/,
  ],
  // ... 21 more languages
};
Language aliases handle cases where VS Code’s language ID differs from the pattern group:
const LANG_ALIASES: Record<string, string> = {
svelte: "javascript",
"vue-html": "vue",
typescriptreact: "typescript",
};
When the extension receives a file, it looks up the language ID, finds the pattern list, and applies each pattern against every line. If no patterns are found for a language, the file is returned unchanged.
Adding a new language is a matter of adding one entry to LANGUAGE_PATTERNS and a set of test cases to stripper.test.ts. The infrastructure handles the rest.
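The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: `stripSimple`, `PATTERNS`, `ALIASES`, and `StripResult` are hypothetical names, the pattern table holds only two trimmed-down entries, and the sketch handles single-line statements only (no paren balancing).

```typescript
// Hypothetical, trimmed-down pattern table (illustration only).
const PATTERNS: Record<string, RegExp[]> = {
  javascript: [
    /^\s*console\.(log|debug|info)\s*\(.*\)\s*;?\s*$/,
    /^\s*debugger\s*;?\s*$/,
  ],
  python: [/^\s*print\s*\(.*\)\s*$/],
};

// Hypothetical alias table: maps an editor language ID to a pattern group.
const ALIASES: Record<string, string> = {
  svelte: "javascript",
};

interface StripResult {
  text: string;      // the text with matching lines removed
  removed: number[]; // 1-based line numbers that were removed
}

function stripSimple(text: string, languageId: string): StripResult {
  // Resolve the alias first, then look up the pattern list.
  const group = ALIASES[languageId] ?? languageId;
  const patterns = PATTERNS[group];
  // Unknown language: return the input unchanged.
  if (!patterns) return { text, removed: [] };

  const kept: string[] = [];
  const removed: number[] = [];
  text.split("\n").forEach((line, i) => {
    if (patterns.some((re) => re.test(line))) removed.push(i + 1);
    else kept.push(line);
  });
  return { text: kept.join("\n"), removed };
}
```

Because the function takes and returns plain strings, it can be exercised from a test file without any editor host.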
Preview Mode
The preview mode was the feature I was most deliberate about designing.
When a developer triggers “Preview & Strip” with Ctrl+Shift+D, the extension:
- Runs the stripping logic internally (without modifying the file)
- Collects the list of lines that would be removed
- Shows a modal dialog listing those lines with their line numbers
- Waits for explicit confirmation before making any changes
This turns a potentially destructive operation into a fully transparent, opt-in action. The developer sees exactly what will happen and retains full control.
The modal includes the first 20 matching lines and appends “…and N more” if there are additional matches. This keeps the dialog readable on files with heavy debug coverage.
If the developer cancels, nothing changes. Not a single character in the file is modified.
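The truncation behavior of the dialog can be sketched as a small pure function. The names `Match`, `MAX_PREVIEW_LINES`, and `buildPreviewMessage` are assumptions for illustration, as is the exact "Line N: ..." wording; only the 20-line limit and the "…and N more" suffix come from the description above.

```typescript
interface Match {
  line: number; // 1-based line number in the file
  text: string; // original content of that line
}

const MAX_PREVIEW_LINES = 20; // limit described in the article

// Builds the text shown in the confirmation dialog (hypothetical format).
function buildPreviewMessage(matches: Match[]): string {
  const shown = matches
    .slice(0, MAX_PREVIEW_LINES)
    .map((m) => `Line ${m.line}: ${m.text.trim()}`);
  const hidden = matches.length - shown.length;
  if (hidden > 0) shown.push(`…and ${hidden} more`);
  return shown.join("\n");
}
```

Keeping the message builder separate from the dialog call means the truncation rule can be tested without showing any UI.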
Highlight Mode
Highlight mode solves a different need: I want to see the debug statements but I’m not ready to remove them yet.
When the developer presses Ctrl+Shift+H, the extension:
- Finds all lines matching debug patterns using findDebugLineIndices()
- Applies a VS Code text decoration that highlights those lines in red
- Adds an inline annotation “← debug” at the end of each line
The file is not modified. This is purely a visual overlay.
Pressing Ctrl+Shift+H again (or clicking the eye-closed icon in the toolbar) removes the highlights.
The highlight controller also handles three lifecycle events automatically:
- When you strip the file, highlights are cleared (no red lines on lines that no longer exist)
- When you edit the file and refreshHighlightsOnEdit is enabled, highlights recompute after a 120ms debounce
- When you switch tabs and clearHighlightsOnTabChange is enabled, highlights are cleared automatically
Workspace Cleanup
The workspace strip command scans every file in the project matching the supported language extensions and strips debug statements from each one.
The implementation uses VS Code’s workspace.findFiles() API with exclusion patterns to skip node_modules, dist, .git, vendor, build, and out by default. All exclusion patterns are configurable in settings.
The operation runs with a progress notification showing the current file being processed and a cancellation option. After completion, a summary shows how many statements were removed across how many files.
Because this modifies files on disk, the documentation explicitly recommends using version control. A git diff after the workspace strip is always informative.
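One way to turn a configurable list of folder names into a single exclusion glob is sketched below. `buildExcludeGlob` is a hypothetical helper, not taken from the extension; the result is meant to be passed as the exclude argument of `workspace.findFiles()`, where `null` means "no excludes".

```typescript
// Builds one exclude glob from a list of folder names (hypothetical helper).
// Returns null when there is nothing to exclude.
function buildExcludeGlob(folders: string[]): string | null {
  const names = folders.map((f) => f.trim()).filter((f) => f.length > 0);
  if (names.length === 0) return null;
  if (names.length === 1) return `**/${names[0]}/**`;
  // Brace groups match any one of the listed folder names.
  return `**/{${names.join(",")}}/**`;
}
```

With the default folders from the article, the call `buildExcludeGlob(["node_modules", "dist", ".git", "vendor", "build", "out"])` yields one pattern that skips all six directories at any depth.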
Safety Design
Safety was a first-class concern throughout the design.
The extension enforces several rules to avoid breaking code it shouldn’t touch:
Rule 1: Whole-line only. A debug call is only removed if it is the entire statement on the line (after indentation). This prevents removal of:
return console.log(x); // kept - inline with return
const result = doWork() || console.log("fallback"); // kept - chained
Rule 2: Comments are never touched. Any line where the first non-whitespace characters form a comment prefix (//, #, --, /*, *) is skipped entirely.
Rule 3: Multiline paren balancing. When a debug call spans multiple lines, the extension tracks open and close parentheses (accounting for parens inside strings) to identify the exact end of the statement. The closing ) and optional ; are consumed, and any remaining code on that final line is preserved.
Rule 4: No modification on cancel. Preview mode never touches the file unless the user explicitly confirms. The stripping logic runs on an in-memory copy of the text, and the result is only written back if the user says yes.
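Rules 1 and 2 can be sketched as a pair of small checks. This is an assumption-laden illustration: `isCommentLine`, `isRemovable`, and the sample pattern are hypothetical, and the pattern only covers single-line console.log calls. The comment prefixes are the ones listed in Rule 2.

```typescript
// Comment prefixes listed in Rule 2.
const COMMENT_PREFIXES = ["//", "#", "--", "/*", "*"];

// Rule 2: a line whose first non-whitespace characters start a comment is skipped.
function isCommentLine(line: string): boolean {
  const trimmed = line.replace(/^\s+/, "");
  return COMMENT_PREFIXES.some((p) => trimmed.indexOf(p) === 0);
}

// Rule 1: the pattern must match the whole line (anchored with ^ and $),
// so a debug call embedded in a larger statement never qualifies.
function isRemovable(line: string, wholeLinePattern: RegExp): boolean {
  if (isCommentLine(line)) return false;
  return wholeLinePattern.test(line);
}

// Hypothetical whole-line pattern for a single-line console.log call.
const CONSOLE_LOG_LINE = /^\s*console\.log\s*\(.*\)\s*;?\s*$/;
```

Under these assumptions, `isRemovable('  console.log("x");', CONSOLE_LOG_LINE)` is true, while the commented, `return`-prefixed, and chained examples above are all rejected.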
Testing Strategy
I wanted the extension to be trustworthy. That required a proper test suite.
The tests live in test/stripper.test.ts and use Node’s built-in test runner – no Jest, no Mocha, no additional dependencies. The test command is:
tsc -p tsconfig.test.json && node --test out-test/test/stripper.test.js
The 151 test cases cover:
- Every language in LANGUAGE_PATTERNS
- Simple single-line removal
- Multiline call removal with paren balancing
- Comment preservation
- Inline code preservation
- Real-world code samples (NestJS controllers, Python classes, Go functions)
- Language aliases (Vue, Svelte, JSX, TSX)
- Unknown languages (should be a no-op)
- The findDebugLineIndices() function (used by highlight mode)
Each test specifies the input code, the expected output, and the expected removal count. Any test that would cause the extension to modify code it shouldn’t – or fail to modify code it should – is a failing test.
Writing tests before polishing features was the right decision. It caught several edge cases during development and gave me confidence when refactoring the paren-balancing logic.
Publishing to the VS Code Marketplace
The Marketplace publishing process is simpler than I expected.
The key steps:
- Create a publisher account at marketplace.visualstudio.com
- Generate a Personal Access Token from Azure DevOps with Marketplace → Manage scope
- Install @vscode/vsce globally: npm install -g @vscode/vsce
- Run vsce package to create the .vsix file
- Run vsce publish --pat TOKEN or upload the .vsix manually via the publisher portal
The .vscodeignore file controls what goes into the package. I excluded src/, test/, and node_modules/ – only the compiled out/ directory ships. This keeps the extension lightweight.
One thing I’d emphasize: set the repository.url in package.json before publishing. It shows up prominently on the Marketplace listing and signals to users that the extension is open source and maintainable.
Lessons Learned
Separate your core logic from your integration layer. This applies far beyond VS Code extensions. Any time you can isolate the testable business logic from the framework or platform code, do it. It makes everything easier.
Safety features build trust. The preview mode and the “never remove inline code” rule weren’t technically necessary for the extension to work. But they’re what make a developer comfortable trusting the tool with real production code.
Test the edge cases first. Multiline calls, strings containing parens, empty files, unsupported languages – these were the cases most likely to cause silent failures. Writing tests for them before the happy path forced me to build robust logic from the start.
Ship before it’s perfect. The first version didn’t have highlight mode. The workspace strip was added later. Shipping v1 with the core feature – preview and strip – got real feedback faster than spending another two weeks on features users might not need.
Small tools have real value. This isn’t a SaaS platform. It’s not an AI product. It’s a focused tool that does one thing well. Developer tools like this get used quietly but consistently. That’s enough.
Future Improvements
Several improvements are on the roadmap:
Custom pattern rules. Allow developers to define their own patterns to remove. For teams using custom logging frameworks, this would make Log Stripper work with any codebase.
Pre-commit hook integration. A script or CLI wrapper that can run as a git pre-commit hook, catching debug statements before they ever reach a commit.
Team-wide configuration. A .logstripper.json configuration file that defines excluded patterns per project, shareable across a team through version control.
Ignore comments. A // log-stripper-ignore annotation that tells the extension to skip a specific line or block.
More language coverage. Zig, Nim, OCaml, and Erlang are candidates for future additions.
Conclusion
Log Stripper is a small tool. It solves one problem, and it solves it well.
But the process of building it taught me something more valuable than any individual feature: the instinct to look at a repetitive problem and ask “why am I doing this manually?” is one of the most productive instincts an engineer can develop.
We accept small frictions as part of the job. We work around limitations instead of fixing them. We solve the same problem repeatedly instead of automating it once.
Log Stripper exists because I stopped accepting one of those frictions.
If you’re reading this and you have something similar – a small repetitive task, an annoying workflow step, a gap in your tooling – build the solution. The process will teach you more than you expect, and the tool will serve you for years.
Try Log Stripper:
VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=saurabhchoudhary.log-stripper
GitHub: https://github.com/saurabhzaiswal/Log-Stripper-VS-Code
I’m actively improving Log Stripper, so real-world feedback is incredibly valuable.
If Log Stripper helps keep a few forgotten debug statements out of production, then it has already paid for itself.
Your Team Has 5 CLAUDE.md Files and They All Say Different Things
You have a team. Everyone uses Claude Code. Everyone wrote their own CLAUDE.md.
Alice wrote: “Never push to production without confirmation.”
Bob wrote: “Always ask before running tests.”
Carlos wrote: “Explain changes before executing.”
Diana wrote: “Use TypeScript strict mode on all new files.”
Elena wrote nothing. She is winging it.
Five developers. Five different agents. No shared behavior. No baseline. No consistency.
And nobody noticed — because each person only sees their own agent.
Why This Happens
CLAUDE.md is not a global config file. It is a per-project, per-developer document that each agent reads at session start and interprets independently.
There is no registry. No enforcement layer. No way to know whether your team’s five CLAUDE.md files agree on anything.
This is fine when you work alone. It becomes a serious problem the moment your codebase is shared.
What Inconsistent CLAUDE.md Files Look Like in Practice
Scenario 1: Merge conflict in behavior, not in code
Alice and Bob both work on the same feature. Alice’s agent asks for confirmation before writing tests. Bob’s agent runs them automatically. The PR reviews look different. The commit histories look different. The behavior of the codebase under AI assistance is inconsistent.
Nobody merges a CLAUDE.md conflict. They just live with it.
Scenario 2: The missing rule gap
Your team decides: no any types in TypeScript. Alice adds it to her CLAUDE.md. Bob does not. Bob’s agent keeps suggesting any. Bob thinks Claude Code is broken. It is not. His CLAUDE.md just never got the memo.
Scenario 3: The rule that traveled
Your company started with two rules. You are now at twelve. Alice has twelve. Bob has eight. Carlos has five. Nobody knows which three caused the regressions last sprint.
The Real Cost
Inconsistent CLAUDE.md files mean:
- Your AI agent enforces different standards depending on who runs it
- Bugs introduced in one developer’s session would not have happened in another developer’s session
- You cannot onboard a new developer with a reliable baseline
- You cannot diagnose whether a compliance failure was a CLAUDE.md problem or a model problem
- Code review becomes “review the human” instead of “review the agent behavior”
The Fix: A Shared CLAUDE.md Baseline
The solution is not to enforce one CLAUDE.md for everyone. Developers have legitimate personal preferences. The solution is to separate shared rules from personal rules.
Layer 1: Shared baseline (committed to the repo)
## Mandatory rules (apply to all contributors)
- Never push to production without explicit confirmation
- Never run destructive database operations without a dry run
- Always explain what you are about to do before executing it
- Do not modify files outside the current task scope
- When in doubt, ask. Do not guess.
## Code standards
- TypeScript: strict mode on all new files, no `any` types
- Tests: do not skip, do not mock without labeling mocks
- Commits: conventional commit format required
Layer 2: Developer-specific additions (gitignored)
# CLAUDE.md.local — Personal additions (not committed)
- I prefer step-by-step explanations over summaries
- Always suggest tests before implementation
Layer 3: A quarterly review
Someone on the team owns the shared CLAUDE.md. Once a quarter, you compare behavior across team members and update the baseline rules to match what actually works.
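As a starting point for that review, a small script can show which rules are missing from which files. The sketch below is hypothetical (`extractRules` and `findDrift` are invented names) and makes a strong simplifying assumption: two rules count as the same only if their bullet text matches exactly after trimming and lowercasing.

```typescript
// Collects bullet-list items ("- rule" or "* rule") from a markdown file.
function extractRules(markdown: string): string[] {
  const rules: string[] = [];
  for (const line of markdown.split("\n")) {
    const m = /^\s*[-*]\s+(.+)$/.exec(line);
    if (!m) continue;
    const rule = (m[1] ?? "").trim().toLowerCase();
    if (rule.length > 0 && rules.indexOf(rule) === -1) rules.push(rule);
  }
  return rules;
}

// Maps each rule to the owners whose file does NOT contain it.
// Rules present in every file are omitted from the result.
function findDrift(files: Record<string, string>): Record<string, string[]> {
  const owners = Object.keys(files);
  const perOwner: Record<string, string[]> = {};
  const allRules: string[] = [];
  for (const owner of owners) {
    const rules = extractRules(files[owner] ?? "");
    perOwner[owner] = rules;
    for (const r of rules) if (allRules.indexOf(r) === -1) allRules.push(r);
  }
  const drift: Record<string, string[]> = {};
  for (const rule of allRules) {
    const missing = owners.filter((o) => (perOwner[o] ?? []).indexOf(rule) === -1);
    if (missing.length > 0) drift[rule] = missing;
  }
  return drift;
}
```

Exact-text matching will miss rules that are worded differently but mean the same thing, so the output is a prompt for discussion rather than a verdict.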
Rule Ordering Matters Too
Within your shared CLAUDE.md, put the most critical safety rules first. Not last.
Claude Code reads the file top to bottom. As the context window fills across a long session, rules near the bottom get weaker. Your “never push to production” rule should be the first thing Claude reads, not the seventh.
Tooling for Shared CLAUDE.md Management
If you want pre-written, team-tested rules that cover the most common failure modes — scope creep, silent pushes, test skipping, destructive operations, ambiguous instructions — the CLAUDE.md Rules Pack includes a structured template built for teams.
It is not a single file you paste blindly. It is a modular structure: shared baseline in the repo, personal layer gitignored, rule ordering that survives context compaction.
If you want to start for free first: Download the free starter — includes the core structure and a minimal shared baseline you can test with your team today.
Summary
| Problem | Impact |
|---|---|
| Each developer writes CLAUDE.md from scratch | No shared baseline |
| No agreed rule set | Agent behaves differently per developer |
| No rule ordering discipline | Critical rules buried and deprioritized |
| No review process | Drift compounds silently |
The fix: shared committed baseline + gitignored personal layer + quarterly review.
If your team is running Claude Code without a shared CLAUDE.md, you do not have one AI agent. You have five.
CLAUDE.md Rules Pack — structured rules for teams and solo devs | Free starter
CTF Submissions for DEF CON 34 are now open.
Challenge submissions for the AppSec Village Wargame Contest at DEF CON 34 are now open.
Think you have what it takes to make the most interesting AppSec challenge? Now is a good time to get started.
Build challenges with the Open Source SecDim Play SDK and win prizes at DEF CON 34.
👉 https://secdim.com/defcon/
One-Second BLE Pairing: UX and Security Best Practices
- Why the One-Second Pair Is the UX North Star
- Choosing Pairing Modes with Speed and Security in Mind
- Advertising and Scanning Patterns for Instant Discovery
- Bonding, Reconnection, and Key Management
- Handling Pairing Failures and User Recovery
- Practical Checklist for One-Second Pairing
A one-second BLE pairing is not marketing fluff — it’s a systems design constraint. Delivering that blink-fast experience requires synchronizing advertising duty cycle, the selected pairing method, the OS scanner heuristics, and how keys are stored and resolved.
Devices that miss the one-second target show the same symptoms: frustrated users tapping “retry”, poor conversion on first use, and support tickets asking why setup takes so long. You’re seeing long discover times, repeated OS permission dialogs, or pairing stalls where encryption never completes — all of which typically point to mismatched radio schedules or an inappropriate pairing method for the device’s I/O capabilities.
Why the One-Second Pair Is the UX North Star
A fast pairing is the single interaction users remember. When pairing takes seconds rather than milliseconds the product feels unreliable; when it’s instant it feels invisible. For many consumer products the practical goal is to make the first-connect flow complete during the time a user has the phone in hand and attention focused — roughly one second. This means you must budget the sequence: discovery → connect → security handshake → service discovery, and tune each stage to shave milliseconds wherever possible.
- Fast discovery only happens when the peripheral advertises aggressively while the phone actively scans with low-latency settings. The Android Fast Pair workstream demonstrates how OS-level orchestration and special BLE advertisements can dramatically reduce UI friction for first-time pairing and account association.
- Security choice dominates the CPU/latency budget: LE Secure Connections uses P‑256 (ECDH) for authenticated key exchange and is cryptographically stronger than legacy pairing, but it consumes CPU and therefore time on constrained MCUs. Use the Bluetooth Security Manager specification as the reference for methods and their guarantees.
- Advertising intervals and duty-cycle strategies are the practical lever you control in firmware; BLE profiles such as the Heart Rate Profile provide recommended fast/slow advertising cadence patterns (e.g., short aggressive burst windows followed by a long low-power period). Use those patterns as starting points for consumer-facing fast-pair flows.
Choosing Pairing Modes with Speed and Security in Mind
You need a decision framework rather than a single “best” method. Pairing modes trade user friction against MITM protection and CPU cost. The Bluetooth Security Manager enumerates the methods you can use (Just Works, Passkey Entry, Numeric Comparison, OOB) and clarifies which provide MITM protection.
| Pairing Method | MITM protection? | User friction | Speed (typical) | Recommended when |
|---|---|---|---|---|
| Just Works | No | None | Fast | Headless sensors, initial quick-demo; only if threat model allows |
| Passkey Entry / Passkey Display | Yes | Medium (user types or reads) | Moderate | Devices with keypad or display |
| Numeric Comparison | Yes | Low–Medium (user taps confirm) | Moderate | Devices with simple display + phone UI |
| Out-of-Band (OOB) | Yes (strong) | Variable (requires external channel) | Fast (if OOB already available) | Paired ecosystems or secure provisioning |
Concrete rules-of-thumb you can apply:
- When the device has no input and no display, Just Works is the only practical initial option; mitigate risk by restricting services until a UX consent step happens in-app.
- When the device can show a 6-digit code or accept a code, use passkey pairing for authenticated MITM protection when practical. The security properties are defined in the Security Manager.
- Use OOB (NFC, QR provisioning) when you can — it moves the authentication off-air and can be fast and secure for first-time setup, but requires additional hardware and process changes.
Decision-tree pseudo-code (use this in firmware/product docs and as the basis for acceptance tests):
// Pseudocode: pairing_mode_select()
if (has_display && phone_ui_supports_numeric_comparison) {
    return NUMERIC_COMPARISON;
} else if (has_input_or_keypad && can_enter_passkey) {
    return PASSKEY_ENTRY;
} else if (oob_channel_available) {
    return OOB;
} else {
    return JUST_WORKS; // fallback, reduce exposed services until app consent
}
Refer to the Bluetooth Security Manager specification for the exact guarantees and trade-offs of each pairing method.
Advertising and Scanning Patterns for Instant Discovery
Discovery is an on-air scheduling problem. Treat advertising as a budgeted resource: high duty cycle for the first 20–30 seconds, then back off. The Heart Rate Profile recommends an initial advertising interval of 20–30 ms for the first 30 seconds, then a longer interval to conserve battery. Use that two-phase pattern as your baseline for first-use UX.
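BLE advertising intervals are configured in units of 0.625 ms rather than in milliseconds, which makes it easy to program the wrong value. The arithmetic is sketched below in TypeScript purely for illustration; `msToAdvUnits` and `toHex16` are hypothetical helper names.

```typescript
// BLE advertising intervals are expressed in units of 0.625 ms.
const ADV_UNIT_MS = 0.625;

// Converts a duration in milliseconds to advertising-interval units.
function msToAdvUnits(ms: number): number {
  return Math.round(ms / ADV_UNIT_MS);
}

// Formats a value as a 16-bit hex literal, e.g. 32 -> "0x0020".
function toHex16(n: number): string {
  return "0x" + ("0000" + n.toString(16).toUpperCase()).slice(-4);
}

console.log(toHex16(msToAdvUnits(20)));   // 0x0020
console.log(toHex16(msToAdvUnits(30)));   // 0x0030
console.log(toHex16(msToAdvUnits(1000))); // 0x0640
console.log(toHex16(msToAdvUnits(2500))); // 0x0FA0
```

So a 20–30 ms fast phase corresponds to 0x0020–0x0030, and a 1–2.5 s slow phase to 0x0640–0x0FA0.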
Practical advertising primitives and how to use them:
- Use connectable undirected advertising for first-time pairing; switch to directed advertising when reconnecting to a known central to get deterministic, near-instant reconnection. The Link Layer/GAP defines directed advertising and how the TargetA field lets you address a known peer using RPAs or identity addresses.
- Keep advertising packets small and focused. Include only the minimum AD fields required for discovery: Service UUID, short local name (if needed), and optionally the Tx Power Level AD field (AD Type 0x0A) to enable proximity heuristics on the phone.
- For Android, prefer ScanSettings with SCAN_MODE_LOW_LATENCY and apply a ScanFilter for your service UUID so the OS spends fewer cycles and reports results immediately. The Android BLE guide documents these APIs and explains background vs foreground scanning behavior.
- For iOS, use scanForPeripherals(withServices:options:) and be aware that background scanning behaves differently: CBCentralManagerScanOptionAllowDuplicatesKey is ignored in background and the OS coalesces discovery events to preserve battery. Use service-filtered scans and state restoration for reliable reacquisition.
Example: peripheral advertising pattern (pseudo-C for Zephyr / Nordic SDK)
/* Aggressive advertising for initial pairing.
 * Intervals are in 0.625 ms units: 0x0020 = 20 ms, 0x0030 = 30 ms. */
const struct bt_le_adv_param *adv_fast = BT_LE_ADV_PARAM(
    BT_LE_ADV_OPT_CONNECTABLE, /* connectable (BT_LE_ADV_OPT_CONN in newer Zephyr releases) */
    0x0020,                    /* min interval: 20 ms */
    0x0030,                    /* max interval: 30 ms */
    NULL                       /* no peer address: undirected advertising */
);
bt_le_adv_start(adv_fast, ad, ARRAY_SIZE(ad), sd, ARRAY_SIZE(sd));
/* after timeout, switch to slow adv: 1s - 2.5s */
Example: Android Kotlin scanner snippet (simplified)
val filter = ScanFilter.Builder()
.setServiceUuid(ParcelUuid(UUID.fromString("0000feed-0000-1000-8000-00805f9b34fb")))
.build()
val settings = ScanSettings.Builder()
.setScanMode(ScanSettings.SCAN_MODE_LOW_LATENCY)
.build()
bluetoothLeScanner.startScan(listOf(filter), settings, scanCallback)
Use allowDuplicates in foreground only when you need continuous RSSI updates or dynamic adv data; avoid it in general because duplicate callbacks cost CPU and power.
Important: Directed advertising for bonded peers gives the fastest reconnection but consumes controller/airtime and should only be enabled briefly when you expect an immediate reconnect. The Link Layer supports high- and low-duty-cycle directed adv modes; prefer low-duty-cycle unless low-latency reconnection is essential.
Bonding, Reconnection, and Key Management
Bonding is what makes the one-second reconnect possible. The security manager defines the keys exchanged during pairing: the Long Term Key (LTK), Identity Resolving Key (IRK), and optional CSRK. The LTK enables encrypted reconnects; the IRK enables resolvable private addresses (RPA) so devices can preserve privacy while still recognizing each other.
Operational checklist you must implement in firmware:
- After a successful pairing that results in bonding, add the peer’s identity address and IRK to the Controller’s resolving list and (optionally) add the peer to the controller white list so the controller can resolve RPAs and filter events without waking the host. Keep the LTK in the host’s bond store; it resumes encryption and plays no part in address resolution. This reduces host wakeups and power.
- Securely persist keys in protected flash with checksums and versioning. Corruption or an interrupted write must not leave the device with a partially valid bond — provide atomic updates or fallback staging area.
- Implement a deterministic bond eviction policy (LRU or oldest-bond) and expose a clear OTA/maintenance path for handling exhausted bond storage on devices with limited NVM.
- Protect LTKs and IRKs with hardware-backed crypto or secure enclaves when available; do not send keys to cloud backup unless you have a robust threat model and clear user consent.
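The persistence and eviction items above can be prototyped off-target before committing to a flash layout. This is a minimal sketch in Python, not firmware: the BondStore class, the JSON record format, and the CRC32 prefix are all assumptions chosen for illustration, and a file rename stands in for an atomic flash commit.

```python
import json
import os
import tempfile
import zlib
from collections import OrderedDict

FORMAT_VERSION = 1  # hypothetical record version


class BondStore:
    """Bond table with LRU eviction, a checksum, and write-then-rename updates."""

    def __init__(self, path, capacity):
        self.path = path
        self.capacity = capacity
        self.bonds = OrderedDict()  # identity address -> keys, least recently used first

    def touch(self, addr, keys=None):
        """Add a bond or mark an existing one as recently used; return evicted addresses."""
        if keys is not None:
            self.bonds[addr] = keys
        self.bonds.move_to_end(addr)  # raises KeyError for an unknown peer
        evicted = []
        while len(self.bonds) > self.capacity:
            evicted.append(self.bonds.popitem(last=False)[0])
        self._save()
        return evicted

    def _save(self):
        doc = {"version": FORMAT_VERSION, "bonds": list(self.bonds.items())}
        payload = json.dumps(doc).encode()
        record = zlib.crc32(payload).to_bytes(4, "big") + payload
        # Write to a staging file first so an interrupted write never replaces good data.
        fd, staging = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "wb") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())
        os.replace(staging, self.path)

    def load(self):
        """Return True if a valid record was loaded, False if missing or corrupt."""
        try:
            with open(self.path, "rb") as f:
                record = f.read()
            crc, payload = record[:4], record[4:]
            if zlib.crc32(payload).to_bytes(4, "big") != crc:
                return False
            doc = json.loads(payload)
        except (FileNotFoundError, ValueError):
            return False
        if doc.get("version") != FORMAT_VERSION:
            return False
        self.bonds = OrderedDict((addr, keys) for addr, keys in doc["bonds"])
        return True


with tempfile.TemporaryDirectory() as workdir:
    store = BondStore(os.path.join(workdir, "bonds.bin"), capacity=2)
    store.touch("phone-a", {"ltk": "00" * 16, "irk": "11" * 16})
    store.touch("phone-b", {"ltk": "22" * 16, "irk": "33" * 16})
    store.touch("phone-a")  # phone-a reconnects, so phone-b becomes the eviction candidate
    print(store.touch("phone-c", {"ltk": "44" * 16, "irk": "55" * 16}))
```

The same shape carries over to firmware: keep two slots, validate the checksum and version on boot, and fall back to the older slot when the newer one fails validation.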
How reconnection typically works:
- Central starts scanning (often filtered for service UUID).
- Peripheral advertises using an RPA; the central’s controller resolves it using the resolving list (if populated), applies the white list policy, and initiates the connection.
- On a reconnect, the central may send the Start Encryption Request using EDIV and Rand to allow the peripheral to look up the correct LTK and resume encryption without re-pairing.
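Address resolution in the second step only applies to resolvable private addresses, which are marked by the two most significant bits of a random address. As a hedged illustration (the function names are made up for this sketch, and the actual hash check against an IRK needs AES-128, which is omitted here), the address type and the prand/hash split can be read like this:

```python
RANDOM_ADDRESS_TYPES = {
    0b00: "non-resolvable private",
    0b01: "resolvable private (RPA)",
    0b11: "static random",
}


def _octets(addr: str) -> bytes:
    """Parse 'AA:BB:CC:DD:EE:FF' (most significant octet first) into 6 bytes."""
    octets = bytes.fromhex(addr.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("a Bluetooth device address has exactly 6 octets")
    return octets


def classify_random_address(addr: str) -> str:
    """Classify a *random* device address by its two most significant bits."""
    top_bits = _octets(addr)[0] >> 6
    return RANDOM_ADDRESS_TYPES.get(top_bits, "reserved")


def split_rpa(addr: str):
    """Split an RPA into prand (upper 24 bits) and hash (lower 24 bits)."""
    if classify_random_address(addr) != "resolvable private (RPA)":
        raise ValueError("not a resolvable private address")
    octets = _octets(addr)
    return octets[:3], octets[3:]


prand, rpa_hash = split_rpa("4A:11:22:33:44:55")
print(classify_random_address("4A:11:22:33:44:55"), prand.hex(), rpa_hash.hex())
```

A resolver then recomputes the hash from each stored IRK and the prand, and treats a match as the identity of the peer.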
Keep an eye on IRK lifecycle: if a device is reset or a bond is erased on one side the other peer will have stale entries in its resolving list; design the mobile app and device to handle this gracefully (clear stale entries or re-establish bond). Recent Bluetooth work also encourages randomized RPA update strategies that move address randomization into the controller for power and privacy benefits; follow the Core 6.x guidance for controller-offloaded RPA updates if your controller supports it.
Handling Pairing Failures and User Recovery
Pairing failures happen for a small set of repeatable reasons: MITM detected, incompatible IO capabilities, key mismatch after reset, or OS-level permission issues. The Security Manager defines Pairing Failed messages with error codes you can use to diagnose problems.
A robust recovery flow (embed this as telemetry events and a troubleshooting UI step):
- Detect and log the Pairing Failed error code and increment a per-device failure counter.
- On the mobile app, show a single concise instruction: “Put the device into pairing mode (hold X for Y seconds); reconnecting will be automatic.” Avoid verbose security explanations. Use visuals; people scan for an instruction and the timer.
- If the device fails to respond after N attempts, trigger a bond reset option: this should clear the device’s local keys and the host-side bond (present “Forget this device” pattern). Make the reset action explicit and protected (long press / hardware button) so it’s not accidentally triggered.
- If automatic reconnection fails because of an RPA/IRK mismatch (common after factory reset of the peripheral), have the mobile app attempt a fresh discovery (no white-list) and present a guided re-pair flow; include a “factory reset” fallback path if necessary.
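The first and third steps of this flow amount to a small piece of bookkeeping. Here is one way it might look, assuming a hypothetical PairingFailureTracker class and a threshold of three attempts; the reason names follow the Security Manager's Pairing Failed reason codes:

```python
from collections import Counter

# Security Manager Pairing Failed reason codes.
SMP_PAIRING_FAILED = {
    0x01: "Passkey Entry Failed",
    0x02: "OOB Not Available",
    0x03: "Authentication Requirements",
    0x04: "Confirm Value Failed",
    0x05: "Pairing Not Supported",
    0x06: "Encryption Key Size",
    0x07: "Command Not Supported",
    0x08: "Unspecified Reason",
    0x09: "Repeated Attempts",
    0x0A: "Invalid Parameters",
    0x0B: "DHKey Check Failed",
    0x0C: "Numeric Comparison Failed",
}


class PairingFailureTracker:
    """Counts failures per device and decides when to offer a bond reset."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.failures = Counter()

    def record_failure(self, device_id, reason_code):
        """Log one Pairing Failed event and return a telemetry record."""
        self.failures[device_id] += 1
        count = self.failures[device_id]
        return {
            "device": device_id,
            "reason": SMP_PAIRING_FAILED.get(reason_code, f"Unknown (0x{reason_code:02X})"),
            "failures": count,
            "next_action": "offer_bond_reset" if count >= self.max_attempts else "retry",
        }

    def record_success(self, device_id):
        """Reset the counter once the device pairs successfully."""
        self.failures.pop(device_id, None)


tracker = PairingFailureTracker(max_attempts=3)
for _ in range(3):
    event = tracker.record_failure("sensor-01", 0x04)
print(event)
```

Emitting the returned record as a telemetry event gives support tooling the reason name and the failure count in one place.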
Diagnostics to report in logs and support tools:
- HCI/LL events for advertisement reception and resolution success/failure.
- Pairing Failed code and the IO capability negotiation values.
- Key store status (number of bonds, last bond timestamp).
Use that data to refine the device’s advertising window, pairing method, or NVM bonding capacity.
Practical Checklist for One-Second Pairing
Below is a deployable checklist you can use in sprint planning, firmware releases, and mobile-app acceptance tests.
Firmware checklist
- [ ] Implement two advertising modes: fast initial (20–30 ms intervals for ~20–30 s) and slow background.
- [ ] Support connectable undirected advertising for first-time pairing, and directed connectable advertising for fast reconnects to bonded devices.
- [ ] On successful bonding: store LTK/IRK atomically, populate the Controller resolving list, and optionally add to the controller white list.
- [ ] Provide a secure, user-accessible factory-reset method to clear bonds.
Mobile app checklist
- [ ] Use OS filtering: Android ScanFilter + SCAN_MODE_LOW_LATENCY.
- [ ] For iOS, scan for specific service UUIDs and implement state preservation/restoration for background reconnections.
- [ ] Keep the pairing UI focused: one action, visible progress (0–100%), and clear failure text that maps to device hardware steps.
- [ ] Implement robust “forget device” and “retry pairing” flows in the app with telemetry for failures.
Testing matrix (minimum)
- First-time pairing: clean phone, clean device.
- Reconnect after sleep: bonded device reconnects when in range.
- Reconnect after peripheral reboot: keys present on phone, device restarted.
- Reconnect after phone factory reset: peripheral must accept new bond.
- Bond capacity: exceed N bonds and validate eviction policy.
- RPA resolution tests: verify controller resolves RPAs when resolving list is full vs not full.
Sample acceptance test for “one-second” (practical)
- Setup: phone screen awake, app in foreground, device 50 cm from phone.
- Criteria: discovery + connect + secure pairing + service access completes < 1s in 9/10 runs; log distribution to find outliers. Use real-world reference phones, and measure with automated scripts as part of your QA runs. Note: certification testbeds (e.g., Fast Pair validator) have formal pass/fail metrics that can be stricter or different in scope.
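The pass criterion is easy to automate once each run's end-to-end duration is logged. A small sketch follows, with the function name and the sample timings invented for illustration:

```python
import statistics


def evaluate_pairing_runs(durations_s, limit_s=1.0, required_passes=9, out_of=10):
    """Check the '9 of 10 runs under 1 s' criterion and summarise the distribution."""
    if not durations_s:
        raise ValueError("no runs recorded")
    passes = sum(1 for d in durations_s if d < limit_s)
    return {
        "runs": len(durations_s),
        "passes": passes,
        # Integer comparison scales the 9/10 ratio to any number of runs.
        "passed": passes * out_of >= required_passes * len(durations_s),
        "median_s": statistics.median(durations_s),
        "worst_s": max(durations_s),
        "outliers_s": [d for d in durations_s if d >= limit_s],
    }


# Illustrative timings in seconds, not measured data.
sample = [0.62, 0.71, 0.58, 0.93, 0.66, 1.42, 0.75, 0.69, 0.81, 0.64]
print(evaluate_pairing_runs(sample))
```

Keeping the outlier list in the report makes it straightforward to correlate slow runs with phone model or radio conditions.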
Sources
Bluetooth Core Specification — Part H: Security Manager Specification – Definitions of pairing methods (Just Works, Passkey, Numeric Comparison, OOB), key distribution (LTK, IRK, CSRK), and Pairing Failed semantics used to reason about MITM and key-management trade-offs.
Bluetooth Heart Rate Profile (Profile guidance on advertising intervals) – Practical recommended advertising cadence (e.g., 20–30 ms fast window then slower background intervals) used as a baseline for consumer fast-pair flows.
Bluetooth Core Specification — Generic Access Profile & Link Layer (directed advertising, resolving list) – Rules for directed vs undirected advertising, resolvable private address (RPA) resolution and how the resolving list and target address fields work.
Bluetooth® Technology Blog — Randomized RPA Updates (privacy & controller offload) – Recent guidance on controller-offloaded/resolution and randomized RPA updates that affect privacy and power trade-offs.
Google Fast Pair Service — Introduction & BLE device spec – Fast Pair design and features that show how OS-level integration and a special BLE advertising flow reduce user friction for instant pairing.
Android Developers — Bluetooth Low Energy (BLE) Overview – Official Android guidance for scanners: ScanFilter, ScanSettings (low-latency), and background/foreground scanning behavior referenced for mobile-side orchestration.
Apple Developer — Core Bluetooth Background Processing for iOS Apps (archived) – Official Apple guidance on scanning and advertising differences when apps are in background, duplicate coalescing, and state preservation.
Bluetooth Assigned Numbers — AD Types & Characteristics (Tx Power, Reconnection Address) – AD Type mapping (0x0A = Tx Power Level) and GATT characteristic UUID references (e.g., Reconnection Address) for advertising payload design.
SimpleLink BLE5 Stack — GAP Bond Manager / Resolving List (TI docs) – Practical description of the resolving list and white list semantics and how controller-side lists are maintained for power-efficient reconnection.
Nordic DevZone — scanning/extended advertising discussion (practical Android/extended adv notes) – Field discussion and pointers about extended advertising, Android scanning incompatibilities (legacy vs extended), and practical developer observations when implementing modern advertising schemes.
A one-second pair is an orchestration problem: align your advertising, choose the right pairing method for the device’s I/O, populate the resolving/white lists on the controller, and design the mobile app to scan and connect aggressively only during the initial pairing window; when those pieces run in lockstep the pairing disappears into the background and your product feels polished.
Top Agentic Frameworks for Building Applications 2026
In 2026, the world of AI is changing at a serious pace. The days of AI systems dealing solely in single-prompt interactions are coming to an end. Instead, these models are evolving into agentic systems – long-running, goal-driven software enabled by agentic frameworks that are becoming a critical layer in modern application architecture.
This rapid shift means that Python developers building autonomous systems are increasingly relying on agentic frameworks to manage reasoning, memory, tools, and collaboration among multiple agents.
You’ve probably already heard of some of the most popular frameworks. LangChain and AutoGen have risen to prominence, but there are dozens more, many of them open-source and only one to two years old. With so many frameworks promising different agentic capabilities, the real challenge is knowing which ones are best suited for the kind of application you want to build.
Let’s take a closer look at some of the most important agentic frameworks on the market in 2026, comparing what each does best and rating them based on our key comparison criteria to help you discover which is best for your projects.
What are AI agents?
An AI agent is a piece of software capable of autonomously reasoning, setting goals, and performing tasks on behalf of a user or another system. As the name suggests, AI agents have a level of agency to learn, adapt, and make decisions independently. This means they can improve their behavior and, over time, choose their own actions to achieve specific goals or outcomes.
AI agents work by following a perceive, reason, act, reflect (PRAR) cycle, which allows them to:
- Perceive: Observe the environment, including user input, system state, tools, and memory, to understand the current context and constraints of the task.
- Reason: Plan, make decisions, and select actions using a large language model (LLM) or hybrid logic.
- Act: Execute actions like calling tools, updating memory, or triggering workflows.
- Reflect: Evaluate the outcome of previous actions and adjust future decisions, plans, or prompts to improve results.
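To make the cycle concrete, here is a deliberately tiny sketch in plain Python. The CounterAgent class and its tools are hypothetical, and a simple rule stands in for the LLM call in the reasoning step:

```python
from dataclasses import dataclass, field

# Hypothetical tools the agent can call; each one mutates the environment.
TOOLS = {
    "increment": lambda env: env.update(value=env["value"] + 1),
    "decrement": lambda env: env.update(value=env["value"] - 1),
}


@dataclass
class CounterAgent:
    goal: int
    memory: list = field(default_factory=list)

    def perceive(self, env):
        # Observe the environment together with what the agent remembers.
        return {"value": env["value"], "steps_so_far": len(self.memory)}

    def reason(self, observation):
        # Stand-in for an LLM call: choose the next action from the observation.
        if observation["value"] == self.goal:
            return "stop"
        return "increment" if observation["value"] < self.goal else "decrement"

    def act(self, action, env):
        TOOLS[action](env)

    def reflect(self, action, before, env):
        # Record whether the action moved the agent closer to its goal.
        improved = abs(self.goal - env["value"]) < abs(self.goal - before)
        self.memory.append({"action": action, "improved": improved})


def run(agent, env, max_steps=10):
    """Repeat perceive, reason, act, reflect until the agent stops or the budget runs out."""
    for _ in range(max_steps):
        action = agent.reason(agent.perceive(env))
        if action == "stop":
            break
        before = env["value"]
        agent.act(action, env)
        agent.reflect(action, before, env)
    return env


agent = CounterAgent(goal=3)
final_env = run(agent, {"value": 0})
print(final_env, agent.memory)
```

Real agents replace the rule in reason with a model call and the counter with richer tools, but the loop keeps the same four stages.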
AI agents rely on the natural language processing capabilities of large language models, but unlike traditional LLMs and AI chatbots, they don’t require continuous user input to perform tasks. Agents are proactive, working autonomously to achieve a goal based on a specified set of rules and parameters.
What is an agentic framework?
An agentic framework provides the infrastructure needed to build, run, and control AI agents at scale. Most modern frameworks offer three core capabilities:
- Orchestration: Controls how agents are sequenced, coordinated, or allowed to collaborate.
- Tools: Define how agents interact with external systems like APIs or databases.
- Memory: Sets out how agents retain and retrieve information across steps or sessions.
While it’s possible to build an agent without a framework, frameworks are vital in ensuring agents are reliable, scalable, and safe.
Agentic frameworks help turn experimental agent builds into maintainable software by facilitating:
- Multi-agent coordination: When multiple agents communicate to plan, work together, and specialize in different areas of a task.
- Human-in-the-loop (HITL) checkpoints: Intentional pause points where a human can review what an agent is about to do.
- Observability, control, and reproducibility: The ability to see what an agent is doing, guide agent behavior, or re-run an agent and receive the same results.
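A human-in-the-loop checkpoint and a basic audit trail can be expressed in a few lines. The sketch below is framework-agnostic and every name in it (checkpoint, send_refund, small_refunds_only) is invented for illustration; in a real application the approve callback would prompt a person instead of applying a rule:

```python
audit_log = []  # observability: every decision is recorded, approved or not


def checkpoint(action, args, approve):
    """Pause before a side-effecting action and run it only if approved."""
    approved = bool(approve(action.__name__, args))
    audit_log.append({"action": action.__name__, "args": args, "approved": approved})
    if not approved:
        return None
    return action(**args)


def send_refund(customer, amount):
    # Hypothetical side-effecting tool.
    return f"refunded {amount} to {customer}"


def small_refunds_only(action_name, args):
    # Stand-in for a human reviewer: approve refunds up to 100 only.
    return args["amount"] <= 100


print(checkpoint(send_refund, {"customer": "c-17", "amount": 40}, small_refunds_only))
print(checkpoint(send_refund, {"customer": "c-18", "amount": 900}, small_refunds_only))
print(audit_log)
```

Because the log captures rejected actions as well as executed ones, it doubles as a record for replaying or reviewing a run later.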
Core orchestration paradigms
Before comparing individual frameworks, it’s important to understand how they operate. Let’s look at the three most commonly used orchestration models in 2026.
Graph-based orchestration
Graph-based orchestration provides maximum control by organizing agents and tools as nodes in a directed graph. Instead of letting an agent freely decide what to do next, the flow that agents are allowed to follow is clearly defined.
Strengths
- More deterministic control: Predictable behavior is critical for production systems that require reliable results.
- Easier debugging: Pinpoint exactly which node failed thanks to clear checkpoints and boundaries.
- Production-grade reliability: This approach is ideal for customer-facing applications, enterprise systems, or regulated environments.
Limitations
- More upfront design: The workflow must be defined in advance, which slows initial development.
- Less “emergent” behavior: Agents are constrained by the graph, leaving less room for experimentation and creativity.
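The idea can be illustrated without any framework. In this sketch, which is not how any particular library implements it, nodes are plain functions, edges are either a fixed successor or a routing function, and the executor records the path so a failure can be traced to a node:

```python
END = "__end__"


def run_graph(nodes, edges, start, state, max_steps=20):
    """Walk a directed graph of callables and return the final state and the path taken."""
    current, trace = start, []
    while current != END:
        if len(trace) >= max_steps:
            raise RuntimeError("step budget exceeded")
        trace.append(current)
        state = nodes[current](state)
        route = edges[current]
        # An edge is either a fixed successor or a function of the state.
        current = route(state) if callable(route) else route
        if current != END and current not in nodes:
            raise KeyError(f"edge from {trace[-1]!r} points at unknown node {current!r}")
    return state, trace


def draft(state):
    return {**state, "draft": f"v{state['revisions'] + 1}"}


def review(state):
    # Hypothetical rule: approve once the draft has been revised at least once.
    return {**state, "approved": state["revisions"] >= 1}


def revise(state):
    return {**state, "revisions": state["revisions"] + 1}


def publish(state):
    return {**state, "published": state["draft"]}


nodes = {"draft": draft, "review": review, "revise": revise, "publish": publish}
edges = {
    "draft": "review",
    "review": lambda s: "publish" if s["approved"] else "revise",
    "revise": "draft",
    "publish": END,
}

final_state, trace = run_graph(nodes, edges, "draft", {"revisions": 0})
print(trace, final_state["published"])
```

The agent never chooses a transition that is not in the edge table, which is where the determinism and the debugging benefit come from.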
Role-based orchestration
Role-based orchestration is most effective when simplicity is a priority. Agents are assigned specific roles, such as “Planner”, “Researcher”, or “Builder”, and collaborate by sending messages to one another.
Strengths
- Intuitive mental model: This type of operation is easy to understand because it effectively mirrors how human teams work.
- Rapid prototyping: Minimal setup is required, allowing more time to explore outcomes.
Limitations
- Harder-to-constrain behavior: Because agents have the freedom to decide what to do next, it’s difficult to enforce strict execution paths.
- Limited determinism: The same input can yield different outcomes, making it tricky to reproduce results and achieve consistency.
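A rough sketch of the pattern in plain Python: each role is a handler that receives a message and returns messages for other roles. The role handlers below are hard-coded stand-ins for LLM-backed agents, and run_team is an illustrative name, not an API from AutoGen or CrewAI:

```python
from collections import deque


def planner(message):
    # Breaks the task down and hands it to the researcher.
    return [("Researcher", f"find sources for: {message}")]


def researcher(message):
    # Gathers material and passes notes to the builder.
    return [("Builder", f"notes on [{message}]")]


def builder(message):
    # Produces the final artifact and sends no further messages.
    return []


def run_team(agents, first_recipient, task, max_messages=10):
    """Deliver messages between roles until the inbox is empty or the budget is spent."""
    inbox = deque([(first_recipient, task)])
    transcript = []
    while inbox and len(transcript) < max_messages:
        recipient, content = inbox.popleft()
        transcript.append((recipient, content))
        inbox.extend(agents[recipient](content))
    return transcript


agents = {"Planner": planner, "Researcher": researcher, "Builder": builder}
for recipient, content in run_team(agents, "Planner", "compare BLE pairing methods"):
    print(f"{recipient} <- {content}")
```

Nothing outside the handlers dictates who speaks next, which is why the execution path is harder to constrain than in a graph; the message budget is the only global control here.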
Chain-based orchestration
Chain-based orchestration, also known as adaptive orchestration, arguably offers the greatest flexibility. Agents in this model operate in dynamic chains or loops, deciding the next step autonomously.
Strengths
- Flexible workflows: Agents are not constrained to a pre-defined path and can freely explore different strategies.
- Suitability for creative tasks: This approach is ideal for research, discovery, and experimentation, as agents can iteratively explore ideas, pivot strategies, and adapt their approach.
Limitations
- Less predictability: Testing and debugging are more challenging because execution paths are harder to reproduce and trace.
- More difficult governance at scale: This unpredictability grows as tasks become more complex.
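As an illustration of the adaptive loop, the sketch below lets a policy function pick the next step from the current state on every iteration. The choose_next_step function is a stand-in for an LLM decision, and the step names are invented for this example:

```python
def choose_next_step(state):
    """Stand-in for an LLM call that picks the next step from the current state."""
    if len(state["notes"]) < 2:
        return "search"
    if state["summary"] is None:
        return "summarize"
    return "finish"


def search(state):
    state["notes"].append(f"note {len(state['notes']) + 1} about {state['topic']}")


def summarize(state):
    state["summary"] = "; ".join(state["notes"])


STEPS = {"search": search, "summarize": summarize}


def run_chain(topic, max_steps=8):
    """Run until the policy says 'finish' or the step budget is exhausted."""
    state = {"topic": topic, "notes": [], "summary": None}
    trace = []
    for _ in range(max_steps):
        step = choose_next_step(state)
        trace.append(step)
        if step == "finish":
            break
        STEPS[step](state)
    return state, trace


state, trace = run_chain("agent memory")
print(trace)
print(state["summary"])
```

The path exists only in the trace after the fact, so the step budget and the recorded trace are the main levers for testing and governance in this model.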
Best agentic frameworks for your projects
Now that we’re familiar with the key orchestration paradigms of agentic frameworks, it’s time to compare some of the most popular frameworks on the market in 2026. Below, we evaluate each framework’s performance against our key comparison criteria:
- Primary orchestration model.
- Multi-agent support.
- Memory capabilities.
- Human-in-the-loop (HITL) support.
- Best-fit applications.
| Framework | Orchestration model | Multi-agent support | Memory capabilities | HITL support | Best used for |
| LangChain | Chain-based | Partial | Moderate | Limited to moderate | Rapid LLM app development |
| LangGraph | Graph-based | Yes | Strong | Strong | Production-grade agent workflows |
| LlamaIndex | Retrieval-centric | Limited | Strong | Moderate | Knowledge-heavy agents |
| Haystack | Pipeline-based/modular | Moderate | Strong | Moderate | Production RAG and context-heavy AI systems |
| AutoGen | Role-based | Strong | Moderate | Limited | Conversational multi-agent systems |
| CrewAI | Role-based | Strong | Light | Limited | Task-oriented agent teams |
| Semantic Kernel | Planner-based | Moderate | Moderate | Strong | Enterprise AI |
| smolagents | Minimalist | Limited | Light | Minimal | Lightweight experiments |
| OpenAI Agents SDK | Graph-based | Yes | Managed | Strong | Hosted agent applications |
| Phidata | Agent-centric | Limited to moderate | Strong | Moderate | Data and tool-heavy agents |
Let’s take a closer look at the strengths and weaknesses of each framework, along with the applications they’re most suited to.
LangChain
- Core design: Chain-based orchestration.
- Philosophy: Developer velocity and flexibility.
Launched in 2022, LangChain is one of the most widely adopted frameworks due to its broad ecosystem of integrations. It serves as an accessible interface for nearly any LLM and is an ideal starting point for enthusiasts or startups looking to explore agentic AI. While not strictly “agent-first”, it provides the building blocks for agentic behavior.
LangChain provides less control than other frameworks, but it’s still a fantastic entry point into agentic systems, especially for projects where speed and creativity take precedence over enforcing strict workflows.
Strengths
- Huge ecosystem.
- Easy tool integration.
- Rapid prototyping.
Limitations
- Less control than graph-based systems.
- Agent logic that can be difficult to understand as it grows in complexity.
Best applications
- Prototyping of agentic features.
- Tool-augmented chatbots.
- LLM-powered backend services.
If you want to go beyond the basics, read our LangChain Python Tutorial: A Complete Guide for 2026. It takes a deeper look at what LangChain offers and walks through real-world use cases for building AI agents in Python.
LangGraph
- Core design: Graph-based orchestration.
- Philosophy: Explicit control over agent behavior.
LangGraph has emerged as a leading choice for production-grade agent systems. Built on top of LangChain, it replaces implicit chains with explicit graphs, providing strict control over workflows and excellent HITL support via interrupts.
While the graph structure itself can actually make debugging easier by clearly mapping how agents and tools interact, LangGraph does come with a learning curve. Much of this complexity comes from designing the graph and managing explicit state between nodes. Once you understand these concepts, the framework becomes a powerful option for building predictable and controllable agent systems.
Strengths
- Deterministic workflows.
- Native state management.
- Excellent HITL support via interrupts.
- Suitability for regulated or mission-critical systems.
Limitations
- Higher upfront design effort.
- Steeper learning curve due to explicit graph and state management.
- Reduced flexibility for open-ended tasks.
Best applications
- Autonomous customer support systems.
- AI-driven DevOps workflows.
- Multi-step decision engines.
LlamaIndex
- Core design: Retrieval-centric orchestration.
- Philosophy: Data-first agents.
LlamaIndex is a Python framework designed to help AI systems understand, store, and retrieve information from large amounts of documents and data.
Rather than starting with agents and adding data later, LlamaIndex takes the opposite approach – it starts with data and then builds agent behavior around it. This is why it is often described as data-first or retrieval-centric.
Because it operates in this way, LlamaIndex excels at indexing, memory, and retrieval, making it ideal for building agents whose intelligence depends on accessing the right information rather than executing complex actions.
Strengths
- Advanced document indexing.
- Strong long-term memory patterns.
Limitations
- Limited suitability for complex, action-heavy orchestration.
- Limited support for multi-agent orchestration.
Best applications
- Research assistants.
- Knowledge base agents.
- Enterprise document intelligence.
Haystack
- Core design: Modular pipeline orchestration.
- Philosophy: Context engineering and production-ready AI systems.
Haystack is an open-source AI orchestration framework created by deepset for building production-ready AI agents, retrieval-augmented generation (RAG) systems, and multimodal applications.
Instead of focusing purely on agent behavior, Haystack structures applications as explicit pipelines composed of retrievers, routers, memory layers, tools, evaluators, and generators. This modular architecture gives you control over how information flows through a system, allowing each component to be tested and improved independently.
Haystack is particularly strong in applications where the quality of retrieved information determines the quality of the model’s output. Its design also makes it well-suited for enterprise environments that require transparency and reliability in production systems.
Strengths
- Highly modular pipeline architecture.
- Excellent support for RAG and document processing.
- Strong ecosystem, particularly in search and RAG-focused enterprise use cases.
- Flexible integrations with models and vector databases.
Limitations
- More infrastructure and setup than lightweight frameworks.
- Less focus on emergent multi-agent collaboration.
Best applications
- Retrieval-augmented generation (RAG) systems.
- Enterprise document intelligence.
- Data-heavy AI applications.
- Production AI pipelines that require strong context control.
AutoGen
- Core design: Role-based multi-agent collaboration.
- Philosophy: Conversation-driven autonomy.
AutoGen, an open-source Microsoft framework, popularized the idea of agents collaborating through structured conversation, organizing systems as teams of agents, each with its own specific role. Unlike in other frameworks, there’s no central controller enforcing a strict execution path – the collaboration itself drives progress.
This approach makes AutoGen ideal for exploratory, creative, and research-driven multi-agent systems, at the cost of predictability, HITL, and strict execution control.
Strengths
- Natural multi-agent interaction.
- Minimal orchestration overhead.
- Suitability for emergent problem-solving.
Limitations
- Limited execution control.
- Weak HITL support.
Best applications
- Coding agents.
- Brainstorming systems.
- AI research experiments.
CrewAI
- Core design: Role-based task delegation.
- Philosophy: Teams of specialized agents.
CrewAI is centered around building simple, structured multi-agent systems. It is similar to AutoGen, modeling AI agents as members of a “crew” where each agent has a clearly defined role. The goal is to make multi-agent systems approachable, even if you are new to agentic AI.
CrewAI prioritizes simplicity and speed over deep memory and production controls, making it easy to learn and a strong option for prototypes and small teams. However, its limited toolset for observability, HITL, and error handling at scale makes it less suited for larger systems.
Strengths
- Very approachable API.
- Clear role separation.
- Fast setup.
Limitations
- Lightweight memory.
- Limited production controls.
Best applications
- Content pipelines.
- Market research automation.
- Simple workflow agents.
Semantic Kernel
- Core design: Planner-based orchestration.
- Philosophy: Enterprise-grade AI integration.
Semantic Kernel is another open-source Microsoft framework, designed for building AI-powered applications that integrate with existing enterprise systems.
It was created with production concerns in mind from the start, emphasizing governance, safety, observability, and human oversight. Rather than maximizing agent autonomy, it focuses on making AI predictable, controllable, and auditable.
By combining structured workflows with LLM reasoning, it trades flexibility and emergent behavior for trust, safety, and operational reliability.
Strengths
- Strong HITL support.
- Enterprise-friendly architecture.
- Good observability.
Limitations
- Heavier upfront structure.
- Less flexibility for open-ended autonomy.
- Steeper learning curve.
Best applications
- Internal enterprise tools.
- AI copilots.
- Business process automation.
smolagents
- Core design: Minimalist chain-based.
- Philosophy: Simplicity over scale.
smolagents is a bare-bones framework designed to make agentic AI as straightforward and transparent as possible. It prioritizes simple, readable code that makes it easy to understand how an agent works without needing to learn a large framework.
smolagents aims to make agent behavior accessible and easy to experiment with by keeping abstractions minimal and logic transparent. It offers first-class support for code-based and tool-calling agents, broad model and tool compatibility, and lightweight CLI utilities, while intentionally trading large-scale orchestration and production features for simplicity and clarity.
Strengths
- Extremely lightweight design.
- High degree of transparency.
- Fast experimentation.
Limitations
- Limited suitability for scaling.
- Minimal production features.
Best applications
- Educational projects.
- Proofs of concept.
- Lightweight local agents.
OpenAI Agents SDK
- Core design: Managed workflow-driven orchestration (often graph-based).
- Philosophy: Hosted, production-ready agents.
Thanks to ChatGPT’s explosion in popularity, we’ve all heard of OpenAI. The Agents SDK is the company’s effort to provide a managed platform for building and running agents without having to maintain your own orchestration infrastructure.
Rather than assembling agents from scratch, you define agent behavior and workflows, while OpenAI provides orchestration, memory management, monitoring, and safety controls. This makes the Agents SDK particularly attractive for teams that want production-ready agents quickly.
Strengths
- Minimal infrastructure burden.
- Built-in safety and observability.
- Strong multi-agent support.
Limitations
- Reduced customization and control.
- Limited suitability for experimental research.
Best applications
- SaaS agent features.
- Customer-facing autonomous systems.
- Teams prioritizing speed over customization.
Phidata
- Core design: Agent-centric, tool-heavy.
- Philosophy: Practical agents for real-world data tasks.
Phidata (since renamed Agno) is designed for building practical, tool-driven AI agents that operate on real-world data.
Rather than focusing on abstract orchestration patterns, Phidata centers the agent around direct interaction with systems such as APIs, databases, and internal services.
Its design reflects the fact that many agents spend most of their time fetching, transforming, and acting on data.
Strengths
- Strong tool integration.
- Suitability for data-centric workflows.
Limitations
- Less emphasis on orchestration.
- Limited multi-agent capabilities.
Best applications
- Data analysis agents.
- Finance and ops automation.
- Tool-driven decision systems.
Choosing the right framework
Now that you’re familiar with many of the most popular frameworks in 2026, it’s time to choose the right one for your project. Let’s take a look at some of the key use cases, along with the frameworks that fit them best.
| Orchestration model | Where to use | Recommended frameworks |
| Graph-based | Projects involving complex branching logic and requiring high levels of reliability, auditability, and control. | LangGraph, OpenAI Agents SDK |
| Role-based | Projects involving rapid development and intuitive design that benefit from emergent collaboration between agents. | AutoGen, CrewAI |
| Chain-based | Projects requiring maximum flexibility, where agents need to adapt dynamically and determine next steps autonomously. | LangChain |
| Retrieval-based | Projects where deep, reliable access to knowledge matters more than high levels of autonomy. | LlamaIndex, Haystack |
| Enterprise-oriented | Projects where strong governance and human-in-the-loop processes are non-negotiable requirements. | Semantic Kernel |
| Lightweight | Rapid prototyping, educational use, and simple local agents where transparency and control matter more than orchestration complexity. | smolagents |
| Tool-centric | Building production agents that primarily interact with APIs, databases, and external systems rather than complex multi-step orchestration. | Phidata |
In 2026, agentic frameworks have evolved from experimental tools into foundational infrastructure for many applications. The key decision is no longer whether to use agents, but how much control, autonomy, and governance your systems require.
Toolbox App 3.5: Better Remote Development Observability, More Reliable Enterprise Configuration, and Smoother Everyday Interactions
Toolbox App 3.5 focuses on making daily work smoother and managed development environments easier to monitor. The app now supports interface zooming with familiar shortcuts, provides OpenTelemetry metrics for enterprise remote development connections, and handles several long-standing reliability issues more gracefully.
Remote development observability
The Toolbox App now emits OpenTelemetry metrics for remote development connection latency and reliability. You can send them to Grafana, Datadog, Prometheus, or another OTEL-compatible stack to monitor connection health across your developer fleet.
Zoom controls
You can now zoom the Toolbox App interface using familiar keyboard shortcuts: Cmd/Ctrl with + to zoom in, Cmd/Ctrl with - to zoom out, and Cmd/Ctrl with 0 to reset. The setting persists across restarts, so your preferred zoom level is preserved.
Cleaner update progress
Checking for updates no longer hides behind a generic spinner. You’ll now see what the app is checking, what it’s unpacking, and how far along it is – providing a clearer sense of progress.
Enterprise configuration
For enterprise customers using JetBrains IDE Services, the Toolbox App now sends static and dynamic headers together when communicating with backend services. Header updates are also pushed automatically to running IDEs – no need to restart an IDE to pick up new headers.
Bug fixes
- IntelliJ-based IDEs no longer randomly disappear from the Toolbox App home view.
- Android Studio and other aliased IDEs keep their display name after updates.
- The taskbar icon on KDE Plasma 6.6 and the tray and app icons on Pop!_OS now appear reliably.
Remote development fixes
- SSH canonicalization failures no longer abort the connection.
- The remote development environment list no longer shows an empty page when the canCreateNewEnvironments flag is set.
We’d love to hear your thoughts on Toolbox App 3.5! Your feedback helps us improve the product, so please share your experience in the comments.
The JetBrains Toolbox App team
