CA 04 – Two Sum & Sorted Two Sum

TWO SUM PROBLEM

I solved the Two Sum problem from LeetCode in Java using the brute-force method. The problem gives an array of integers and a target value. I need to find two indices such that the numbers at those indices add up to the target. I can’t use the same element twice, and there will always be exactly one valid answer.

To solve this, I used a simple method. I took one number from the array and compared it with every number that comes after it. For every pair, I checked whether the sum equals the target. If it does, I return the indices of those two numbers.

I used two loops for this. The first loop picks one number, and the second loop checks the remaining numbers. As soon as it finds a pair that matches the target, it returns the answer.

This method works correctly, but it takes more time because it checks every pair. The time complexity is O(n^2) because of the nested loops. The space complexity is O(1) because no extra space is used.

class Solution {
    public int[] twoSum(int[] nums, int target) {
        for(int i = 0; i < nums.length; i++) {
            for(int j = i + 1; j < nums.length; j++) {
                if(nums[i] + nums[j] == target) {
                    return new int[] {i, j};
                }
            }
        }
        return new int[] {};
    }
}
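For comparison only (it is not the solution I submitted), the standard way to bring the time down to O(n) is a single pass with a HashMap that remembers the index of each value seen so far. The class name below is just an illustrative label:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the O(n) variant: trades O(n) extra space for a single pass.
class TwoSumHashMap {
    public static int[] twoSum(int[] nums, int target) {
        // Maps each value seen so far to its index.
        Map<Integer, Integer> seen = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            int complement = target - nums[i];
            // If the complement was seen earlier, we have the pair.
            if (seen.containsKey(complement)) {
                return new int[] { seen.get(complement), i };
            }
            seen.put(nums[i], i);
        }
        return new int[] {}; // unreachable if exactly one answer is guaranteed
    }
}
```

The trade-off is the opposite of the brute-force version: one linear pass instead of all pairs, at the cost of O(n) extra space for the map.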

SORTED TWO SUM

The task is to find two numbers in a sorted array whose sum equals the target and return their positions. The answer uses 1-based indexing, so I need to add 1 to the indices before returning them.

To solve this, I used a simple brute-force method. It checks every possible pair of numbers in the array and tests whether their sum equals the target.

I used two loops for this. The first loop picks one number, and the second loop checks all the numbers after it. For each pair, it adds the two numbers and compares the result with the target. If the sum matches, it returns the indices as i + 1 and j + 1.

class Solution {
    public int[] twoSum(int[] numbers, int target) {
        for(int i = 0; i < numbers.length; i++) {
            for(int j = i + 1; j < numbers.length; j++) {
                if(numbers[i] + numbers[j] == target) {
                    return new int[] {i + 1, j + 1};
                }
            }
        }
        return new int[] {};
    }
}

This method works correctly, but it is slower than necessary because it never takes advantage of the array being sorted. The time complexity is O(n^2) because of the nested loops. The space complexity is O(1) since no extra space is used.
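Because the input is sorted, the usual improvement is the two-pointer technique: start one pointer at each end and move them inward. This is a sketch of that O(n) approach, shown for comparison rather than as the code I submitted (the class name is just an illustrative label):

```java
// Sketch of the O(n) two-pointer variant, which relies on the array being sorted.
class SortedTwoSumPointers {
    public static int[] twoSum(int[] numbers, int target) {
        int left = 0;
        int right = numbers.length - 1;
        while (left < right) {
            int sum = numbers[left] + numbers[right];
            if (sum == target) {
                return new int[] { left + 1, right + 1 }; // 1-based indices
            } else if (sum < target) {
                left++;  // sum too small: move to a larger left value
            } else {
                right--; // sum too large: move to a smaller right value
            }
        }
        return new int[] {};
    }
}
```

Each step discards one element for good, so the loop runs at most n - 1 times and no extra space is needed.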

Claude Opus 4.6 vs. GPT 5.4: My Take as a C#/.NET Dev on AI Coding Companions

Alright team, let’s talk AI. As a senior engineer who’s spent more years than I care to admit wrangling C# and .NET, I’ve seen my fair share of “game-changing” tech. Most of it is just hype. But these new-gen LLMs? They’re different. We’re talking about legitimate productivity boosters, especially when you’re staring down a tricky bug or architecting a new microservice.

Lately, I’ve been putting the big two — Claude Opus 4.6 and GPT 5.4 — through their paces specifically for coding tasks. The question isn’t if they’re useful, but which one to bring to the fight, or if we should be thinking “both.” Let’s dive into my real-world experiences.

The Setup: My C#/.NET AI Playground

Before we get into the nitty-gritty, a quick word on my testing environment. I wasn’t just asking them to write “Hello World.” I was throwing real-world problems at them: building complex LINQ queries, designing a robust API controller, refactoring legacy code, even trying to get them to write xUnit tests for some tricky asynchronous logic.

I wanted to see how they handled:

  • Context: Can they keep track of a larger codebase or conversation?
  • Precision: Do they generate code that actually compiles and runs correctly the first time?
  • Nuance: Can they understand why I’m asking for something, not just what?
  • Debugging: How good are they at finding issues in their own code or mine?

GPT 5.4: The Speedy Generalist with a Few Surprises

GPT 5.4 feels like that incredibly bright junior developer who’s read every programming book but sometimes misses the specific context of our project. It’s fast, incredibly broad in its knowledge, and often provides surprisingly elegant solutions right out of the gate.

When I needed boilerplate code for a new DbContext or a standard ASP.NET Core controller, GPT 5.4 was lightning-fast and usually spot-on. It’s fantastic for generating common design patterns or even suggesting different approaches to a problem.

Where it Shines:

  • Broad Knowledge Base: If it’s a common C# pattern, a widely used .NET library, or a general algorithm, GPT 5.4 knows it.
  • Code Generation Speed: It often generates long code blocks quickly, perfect for getting a first draft down.
  • Exploration: Great for brainstorming different ways to solve a problem or exploring new libraries.

Where I Pump the Brakes:
Sometimes, GPT 5.4 can be a bit too confident. It occasionally generates plausible-looking code that has subtle bugs, or it might make assumptions about my project that aren’t true. I’ve also found it can “forget” earlier parts of our conversation if the thread gets too long. It’s like it gets distracted by the next shiny coding problem.

Claude Opus 4.6: The Meticulous Architect, Slow and Steady

Claude Opus 4.6, on the other hand, feels more like a seasoned architect. It’s often slower to respond, but its answers tend to be incredibly thoughtful, detailed, and deeply contextual. It seems to “think” more before responding, often asking clarifying questions or laying out its reasoning step-by-step.

For complex refactoring tasks, or when I was trying to optimize a specific piece of asynchronous code for performance, Claude truly stood out. It provided not just the code, but the rationale behind the choices, often citing best practices or potential pitfalls. It felt like pair programming with someone who meticulously considers every angle.

Where it Shines:

  • Contextual Understanding: It maintains context over very long conversations, making it excellent for multi-step tasks or complex debugging.
  • Deep Reasoning: Its explanations are often superb, breaking down complex problems and justifying its code choices.
  • Fewer Hallucinations: I’ve found it to be more reliable in generating correct, runnable code without subtle errors. It double-checks its work, which is invaluable.
  • Refactoring & Debugging: Excellent at identifying issues in existing code and suggesting robust improvements.

Where I Feel the Pinch:
The speed. Sometimes, when you just need a quick IEnumerable extension method or a simple DI setup, waiting for Claude’s detailed explanation can feel a bit overkill. It’s not a rapid-fire code generator in the same way GPT 5.4 can be.

The Verdict: Don’t Choose, Combine!

After weeks of real-world use, my conclusion is clear: you don’t have to pick just one. These aren’t competitors; they’re complementary tools in a modern software engineer’s arsenal.

Think of it this way:

  • Reach for GPT 5.4 when:

    • You need rapid prototyping or boilerplate generation.
    • You’re exploring new libraries or frameworks and need quick examples.
    • You’re stuck on a common problem and need a few different potential solutions fast.
    • You need simple, isolated code snippets.
  • Reach for Claude Opus 4.6 when:

    • You’re working on complex architectural decisions or refactoring significant parts of your codebase.
    • You need detailed explanations, best practices, and a deeper understanding of why certain code is structured a certain way.
    • You’re debugging persistent, tricky issues and need a methodical, logical approach.
    • You have a long, ongoing conversation about a specific problem and need the AI to maintain deep context.

I often start with GPT 5.4 for initial drafts or quick ideas. Then, if I hit a wall, or if the problem requires more nuanced reasoning, I’ll port the conversation (or at least the core problem) over to Claude Opus 4.6 for a more in-depth architectural review or a meticulous debugging session. It’s like having a brilliant junior dev for the grunt work and an experienced architect for the heavy lifting.

Your AI Pair Programming Partner(s)

Adopting these tools isn’t about replacing engineers; it’s about augmenting our capabilities. It’s like having a super-powered pair programmer who never gets tired and has read the internet. For us C# and .NET folks, understanding the strengths of both Claude Opus 4.6 and GPT 5.4 means we can write better code, faster, and with fewer headaches.

What are your experiences? Have you found one to be clearly superior for your specific tech stack, or are you also seeing the value in a multi-model approach? Let me know in the comments!

Stop Fighting Zustand Context: Practical Store Scoping Patterns for React

Zustand is one of the rare state management libraries that feels good almost immediately. It is small, fast, and does not try to force a framework-sized architecture onto your app.

That simplicity is exactly why many teams adopt it quickly.

Then the app grows, and a different problem shows up: scoped state.

What happens when your app needs multiple, isolated instances of the same store? Imagine a dashboard where each complex “widget” needs its own independent state or a multi-step “wizard” where simultaneous tabs shouldn’t overwrite each other’s data.

The official Zustand documentation recommends using React Context for this, but doing it manually is a grind. You have to:

  1. Create a React Context.
  2. Create a factory function for the store instance.
  3. Build a wrapper Provider component.
  4. Manually rebuild strongly-typed selector hooks (useStore, useStoreApi) for consumers.
  5. Pepper your codebase with useShallow to prevent unnecessary re-renders when returning objects or arrays.

At that point, plain Zustand is still capable, but the implementation starts getting repetitive.

To reduce that boilerplate, I built @okyrychenko-dev/react-zustand-toolkit.

It gives you a few composable helpers around Zustand:

  • generated context providers
  • shallow-first selectors by default
  • “resolved” hooks that can read from either a scoped or global store
  • a small set of React 19 utilities

The goal of this article is not to oversell that toolkit. It is to show the real architectural cases where it helps, where it does not, and how its three main factory functions map to actual React state ownership patterns.

Before We Start: The Real Problem

Zustand itself is not the problem. In many apps, plain Zustand is already enough:

  • one global store
  • a few focused selectors
  • occasional middleware
  • no need for isolated store instances

The pain starts when your architecture stops being purely global.

That usually happens in one of these situations:

  • you render the same complex widget multiple times and each instance needs separate state
  • you build reusable modules that should work standalone and also inside a larger application
  • you want most of the app to read from one global store, but a subtree should temporarily override it
  • you are tired of repeating provider + context + hook wiring for every isolated Zustand use case

That is the exact gap this toolkit is trying to cover.

So while this article shows the library API, the more important takeaway is architectural:

  • use a plain global store when isolation is not needed
  • use scoped providers when identity and lifetime matter per subtree
  • use resolved hooks when consumers should not care where the state comes from

With that framing in place, the API makes much more sense.

1. The Global Singleton: createShallowStore

Let’s start with the simplest layer.

If you are just building a standard global store, the main reason to use this layer is shallow-first selectors.

In standard Zustand, if your selector returns a new object or array, your component will re-render every single time the store updates, even if the selected values haven’t changed. To fix this, you have to manually wrap your selectors:

// ❌ Standard Zustand requires boilerplate for shallow picks
import { useShallow } from 'zustand/react/shallow'

const { id, name } = useUserStore(
  useShallow((state) => ({ id: state.id, name: state.name }))
)

With createShallowStore, your generated hooks use zustand/shallow by default. You can pick objects and arrays freely without the boilerplate:

import { createShallowStore } from "@okyrychenko-dev/react-zustand-toolkit";

interface SessionStore {
  token: string | null;
  user: { name: string; role: string } | null;
  login: (token: string, user: { name: string; role: string }) => void;
}

const { useStore, useStorePlain, useStoreApi } = createShallowStore<SessionStore>((set) => ({
  token: null,
  user: null,
  login: (token, user) => set({ token, user }),
}));

// ✅ Object picks use shallow comparison by default.
function ProfileInfo() {
  const { token, user } = useStore((state) => ({
    token: state.token,
    user: state.user
  }));

  return <div>{user?.name}</div>;
}

If you ever need standard, strict-equality behavior, the toolkit always provides explicit useStorePlain alternatives.

Why this matters in practice

The shallow-first approach is especially useful when components naturally want to read small object bundles:

const { isLoading, error, reload } = useStore((state) => ({
  isLoading: state.isLoading,
  error: state.error,
  reload: state.reload,
}));

In plain Zustand, patterns like this often push teams into one of two habits:

  • wrapping selectors in useShallow
  • splitting every field into its own selector call

Both work. They are just noisy when repeated across a large codebase.

This helper does not replace selector discipline. It simply makes the common “pick a few fields” case less repetitive.

What it does not do

It is still important to be precise about the limits:

  • it does not make every selector free
  • it does not replace good store design
  • it does not solve deep comparison problems
  • it does not remove the need to think about derived data and subscription granularity

It mainly improves the ergonomics of shallow object and array picks.

2. Isolated Store Contexts: createStoreProvider

The next layer is where Zustand usually becomes a little more manual.

When you need true isolation, where every instance of a component must own a separate store, createStoreProvider removes most of the repetitive setup.

It generates the Context, the Provider component, and the typed consumer hooks in a single call.

import { createStoreProvider } from "@okyrychenko-dev/react-zustand-toolkit";

interface WizardStore {
  step: number;
  direction: 'forward' | 'backward';
  next: () => void;
}

// 1. Generate the provider and hooks
export const { 
  Provider: WizardProvider, 
  useContextStore,
  useContextStoreApi 
} = createStoreProvider<WizardStore>((set) => ({
  step: 1,
  direction: 'forward',
  next: () => set((state) => ({ 
    step: state.step + 1, 
    direction: 'forward' 
  })),
}), "Wizard");

// 2. Consume safely within the isolated tree
function WizardControls() {
  const step = useContextStore((state) => state.step);
  const next = useContextStore((state) => state.next);

  return (
    <div>
      <p>Current Step: {step}</p>
      <button onClick={next}>Next Step</button>
    </div>
  );
}

Why provider-scoped Zustand is useful

Context-scoped stores are not just an implementation detail. They model a different ownership pattern.

With a global singleton store:

  • the store exists once
  • every consumer shares the same data
  • state lifetime usually matches the application lifetime

With a provider-scoped store:

  • each provider instance owns one store
  • sibling subtrees can hold completely different values
  • state lifetime follows the mounted subtree

That makes provider-scoped stores a good fit for:

  • wizards
  • modals with complex internal state
  • embeddable widgets
  • repeated dashboard panels
  • request or test isolation

Provider Lifecycle Hooks

Sometimes you need to initialize your isolated store with data from outside (like props) before the component renders, or run a side effect right after it mounts.

The generated Provider component supports two lifecycle stages:

  • onStoreInit: Synchronous initialization during store creation.
  • onStoreReady: Post-commit side effects.

function WizardShell({ initialStep }: { initialStep: number }) {
  return (
    <WizardProvider
      onStoreInit={(store) => {
        // Initialize the store synchronously before first render
        store.setState({ step: initialStep });
      }}
      onStoreReady={(store) => {
        // Run side effects like analytics tracking after mount
        console.log("Wizard instance mounted at step", store.getState().step);
      }}
    >
      <WizardControls />
    </WizardProvider>
  );
}

That split is small, but useful:

  • onStoreInit is for deterministic setup before consumers read the store
  • onStoreReady is for effects that should happen after mount

That is a better mental model than mixing initialization and side effects in the same callback.

3. The Best of Both Worlds: createStoreToolkit

This is the layer that makes the package feel more like a toolkit and less like a single helper.

What if you have global state, but certain parts of the UI need to override it locally?

This is where createStoreToolkit becomes useful.

It creates both a global singleton store and an optional context provider. It also gives you resolved hooks such as useResolvedValue and useResolvedStoreApi.

These hooks dynamically check the React Component tree:

  1. Are we inside a Provider for this store? If yes, use the scoped context store.
  2. No Provider found? Fall back to the global singleton store.

Take a look at this Theme example:

import { createStoreToolkit } from "@okyrychenko-dev/react-zustand-toolkit";

interface ThemeStore {
  mode: 'light' | 'dark';
  setMode: (mode: 'light' | 'dark') => void;
}

// Generates both global store AND provider
const themeToolkit = createStoreToolkit<ThemeStore>((set) => ({
  mode: 'light', // Global default
  setMode: (mode) => set({ mode }),
}), { name: "Theme" });

export const { useResolvedValue: useTheme } = themeToolkit;
export const { Provider: ThemeProvider } = themeToolkit.provider;

Now consuming components do not need to care whether they are reading from the global store or a scoped provider instance:

function ThemedCard() {
  const mode = useTheme((state) => state.mode);
  return <div className={`card-${mode}`}>Smart Card</div>;
}

function App() {
  return (
    <div>
      {/* 🌍 1. Uses the global 'light' theme */}
      <ThemedCard /> 

      {/* 🏠 2. Overrides the state to 'dark' for this specific tree ONLY */}
      <ThemeProvider onStoreInit={(store) => store.getState().setMode('dark')}>
        <div className="dark-zone">
          <ThemedCard />
        </div>
      </ThemeProvider>
    </div>
  );
}

This hybrid pattern is useful for reusable UI modules, nested widgets, or apps where most of the UI can share one store, but a subtree sometimes needs an isolated instance.

Why resolved hooks are interesting

This is probably the most opinionated part of the library.

Normally, when a component can run in two modes, you end up with one of these designs:

  • separate hooks for global and scoped usage
  • props that inject the store
  • branching logic scattered across the component tree

Resolved hooks collapse that decision into one place:

  • inside the matching provider, read the scoped store
  • outside it, read the global store

That can simplify component APIs a lot, especially in shared UI packages.

A good mental model

Think of createStoreToolkit as:

  1. a normal global Zustand store
  2. plus an optional scoped override mechanism
  3. plus consumer hooks that pick the nearest valid source

That framing is more accurate than thinking of it as “magic context Zustand”.

Where to be careful

Resolved hooks are convenient, but they are also a design choice. I would avoid them when:

  • the distinction between global and local state should be explicit in the component API
  • debugging would become ambiguous because a component may silently switch data sources
  • different teams own global and scoped behavior separately

In other words, resolved hooks are best when the fallback behavior is intentional, not surprising.

4. Middleware Without Losing Types

So far the value has been architectural. This section is more about preserving the normal Zustand experience.

Zustand middleware such as Redux DevTools, Persist, or SubscribeWithSelector still belongs in the store creator.

The useful part here is that the toolkit preserves the resulting store API types, so helpers like persist.rehydrate or selector-aware subscribe remain available on useStoreApi.

import { createShallowStore } from "@okyrychenko-dev/react-zustand-toolkit";
import { devtools, persist } from "zustand/middleware";

interface CartStore {
  items: string[];
  addItem: (item: string) => void;
}

// Middleware types are preserved on the store API.
const { useStore, useStoreApi } = createShallowStore<
  CartStore,
  [["zustand/persist", CartStore], ["zustand/devtools", never]]
>(
  persist(
    devtools(
      (set) => ({
        items: [],
        addItem: (item) => set((state) => ({ items: [...state.items, item] })),
      }),
      { name: "GlobalCartStore" }
    ),
    { name: "cart-storage" }
  )
);

// Mutator APIs stay typed.
useStoreApi.persist.rehydrate();
useStoreApi.devtools.cleanUp();

This does not mean the toolkit adds a custom DevTools layer for provider stores. If you want Redux DevTools, apply Zustand middleware in the creator itself. Dynamic provider instances are not auto-connected for you.

This is a subtle point, but an important one.

The library is not trying to compete with Zustand middleware. It is trying to stay out of the way while preserving the resulting types.

That is the right design choice. Middleware remains a Zustand concern, not a toolkit-specific abstraction.

5. Ready for React 19 ⚛️

This part is useful, but it should be read with the right expectations.

React 19 introduces hooks and rendering primitives such as Transitions, Action State, and Optimistic Updates.

@okyrychenko-dev/react-zustand-toolkit includes a few small utilities around those APIs. They are wrappers, not a new state model.

Wrapping Actions in Transitions

If you have an update that may trigger expensive rendering, you can wrap the action in a transition:

import { createTransitionAction } from "@okyrychenko-dev/react-zustand-toolkit";

const incrementInTransition = createTransitionAction(() => {
  // This update runs inside React.startTransition
  counterToolkit.useStoreApi.getState().increment();
});

Action State Adapters

If you want a thin adapter over useActionState for store-related async actions:

import { useActionStateAdapter } from "@okyrychenko-dev/react-zustand-toolkit";

function SaveForm() {
  const [status, submitForm, isPending] = useActionStateAdapter(
    async (payload: FormData) => {
      await myApi.save(payload);
      myStore.getState().markSaved();
      return "saved";
    }, 
    "idle"
  );

  return (
    <form action={submitForm}>
      <button disabled={isPending}>
        {isPending ? "Saving..." : "Save"}
      </button>
      {status === 'saved' && <p>Saved successfully!</p>}
    </form>
  );
}

Optimistic UI Updates

If you want an optimistic layer on top of committed Zustand state:

import { useOptimisticReducer } from "@okyrychenko-dev/react-zustand-toolkit";

function TodoList() {
  const serverTodos = useTodos((state) => state.todos);

  const [optimisticTodos, addOptimisticTodo] = useOptimisticReducer(
    serverTodos,
    (current, nextTodo) => [...current, nextTodo]
  );

  // ... render optimisticTodos instead of serverTodos
}

I would treat these helpers as convenience utilities, not the center of the package.

They are nice because they keep React 19-oriented code close to the same toolkit, but the core value of the library is still:

  • store scoping
  • shallow-first selectors
  • resolved hooks

That is where the architectural leverage really is.

6. Which Factory Should You Reach For?

If you only remember one section from this article, make it this one.

If you are evaluating the library quickly, this is the practical decision tree:

Use createShallowStore when:

  • you want one global singleton store
  • your main annoyance is repeated useShallow usage
  • you do not need isolated instances

Use createStoreProvider when:

  • every mounted subtree should own its own store
  • the state lifetime should end when that subtree unmounts
  • store isolation should be explicit

Use createStoreToolkit when:

  • you want a global store by default
  • some subtrees should be able to override it with local instances
  • your consumers should work in both environments with the same hook API

That separation is one of the better aspects of the package. The API is not trying to force one pattern onto every use case.

Quick Comparison

| Factory | Best for | Store lifetime | Main benefit |
| --- | --- | --- | --- |
| createShallowStore | One global store | App-wide | Shallow-first selectors with low boilerplate |
| createStoreProvider | Isolated subtree state | Per provider instance | Explicit store ownership and lifecycle |
| createStoreToolkit | Mixed global + local override scenarios | Global plus optional scoped instances | Shared consumer API through resolved hooks |

7. When You Probably Do Not Need This Library

This section matters because a good abstraction should come with a clear boundary.

It is also worth being explicit about the non-use-cases.

You probably do not need this toolkit if:

  • your app already works well with a single global Zustand store
  • you rarely select object or array bundles
  • you do not use scoped providers at all
  • you prefer explicit store injection over fallback resolution

There is no benefit in adding an abstraction layer just because it exists.

Good Zustand architecture is still mostly about picking the right ownership model for state. This toolkit simply makes a few of those models easier to implement consistently.

Wrapping Up

The strongest part of react-zustand-toolkit is not that it reinvents Zustand. It does not.

Its value is that it packages a few repeatable patterns into a small API:

  • generated providers and hooks for isolated store instances
  • shallow-first selector hooks with explicit plain alternatives
  • resolved hooks for code that should work both inside and outside a provider
  • typed passthrough for Zustand middleware
  • a few optional React 19 wrappers

If those are problems you keep solving by hand, the library is worth a look.

If your app only needs a single global store, plain Zustand may still be enough, and that is completely fine.

But if your real problem is no longer “how do I store state?” and has become “who owns this state, how many instances of it exist, and how should components resolve it?”, then this toolkit starts to become much more interesting.

Next Steps

Install it today:

npm install @okyrychenko-dev/react-zustand-toolkit zustand

Check out the full API reference, examples, and source code in the GitHub Repository.

If you have run into the “global store everywhere, until one subtree needs isolation” problem, this is the part of Zustand architecture the toolkit is trying to simplify.

bQuery.js 🥂 The jQuery for the Modern Web Platform

A deep-dive into the modular, zero-build frontend framework that bridges the gap between vanilla JavaScript and full-blown frameworks

Introduction

Remember jQuery? That legendary library that made DOM manipulation actually enjoyable back in the day? Well, times have changed: browsers got smarter, the web platform grew up, and build toolchains ballooned into something that requires a PhD to configure properly.

But here’s the thing: sometimes you just want to grab an element, wire up some reactive state, and get on with your life. No Vite config, no node_modules rabbit hole, no framework-specific mental model to internalize. Just… JavaScript. On the web. Like the good ol’ days, but modern.

That’s exactly where bQuery.js comes in.

bQuery (v1.7.0 as of this writing) describes itself as “the jQuery for the modern web platform”, and it earns that title. It takes the directness and ergonomics of jQuery and layers on signals-based reactivity, async data composables, native Web Components, motion, forms, i18n, accessibility primitives, drag-and-drop, SSR, and a whole lot more. All of it modular. All of it progressively adoptable.

Let’s break it down.

Table of Contents

  1. What Is bQuery?
  2. Getting Started: Zero Build, No Excuses
  3. The Core API: Good Old DOM Manipulation
  4. Reactive Primitives: Signals All the Way Down
  5. Async Data & Fetching
  6. Building Web Components with bQuery
  7. @bquery/ui: The Default Component Library
  8. The Broader Ecosystem at a Glance
  9. When Should You Reach for bQuery?
  10. Conclusion

1. What Is bQuery?

bQuery is a modular JavaScript/TypeScript library published under @bquery/bquery on npm. Its philosophy can be summed up in three bullet points:

  • Zero build required: works via CDN or ES modules straight in the browser; Vite is optional
  • Secure by default: sanitized DOM operations and Trusted Types compatibility out of the box
  • Progressive: import only what you need, add complexity only where you need it

The package is split into focused submodules so you never pay for what you don’t use:

| Module | What it does |
| --- | --- |
| core | Selectors, DOM manipulation, events, utilities |
| reactive | Signals, computed, effects, async composables |
| component | Typed Web Components with Shadow DOM control |
| motion | Transitions, FLIP, springs, parallax, typewriter |
| security | Sanitization, Trusted Types, CSP helpers |
| platform | Storage, cookies, cache, page meta, announcer |
| router | SPA routing with guards and declarative links |
| store | Signal-based global state with persistence |
| forms | Reactive form state and validators |
| i18n | Locale, translations, pluralization, Intl formatting |
| a11y | Focus traps, skip links, live regions, media audits |
| dnd | Draggable, drop zones, sortable lists |
| media | Viewport, network, battery, clipboard wrappers |
| plugin | Custom directive and component registration |
| devtools | Signal/store/component inspection at runtime |
| testing | Component mounts, mock signals, async assertions |
| ssr | Server-side rendering with hydration |

That’s a lot of ground covered, and yet the entry point stays clean because you only import what you actually touch.

2. Getting Started: Zero Build, No Excuses

The fastest way to try bQuery is dropping a <script type="module"> into an HTML file:

<!DOCTYPE html>
<html>
  <head>
    <title>bQuery Demo</title>
  </head>
  <body>
    <button id="counter">Count: 0</button>

    <script type="module">
      import { $, signal, effect } from 'https://unpkg.com/@bquery/bquery@1/dist/full.es.mjs';

      const count = signal(0);

      effect(() => {
        $('#counter').text(`Count: ${count.value}`);
      });

      $('#counter').on('click', () => {
        count.value++;
      });
    </script>
  </body>
</html>

No build step. No config. Reactive state, DOM manipulation, and event handling in ~10 lines. If that doesn’t put a smile on your face, I don’t know what will.

For project-based setups you install it the usual way:

npm install @bquery/bquery
# or
pnpm add @bquery/bquery
# or
bun add @bquery/bquery

And then import from the main entry point or directly from individual submodules:

// Everything from one place
import { $, signal, effect, component } from '@bquery/bquery';

// Or surgically pick submodules
import { $, $$ } from '@bquery/bquery/core';
import { signal, computed, effect, useFetch } from '@bquery/bquery/reactive';
import { component, html, registerDefaultComponents } from '@bquery/bquery/component';

3. The Core API: Good Old DOM Manipulation

The core module is the jQuery-familiar part of bQuery. You get $ for single elements and $$ for collections. Both return wrapper objects with a chainable API.

import { $, $$ } from '@bquery/bquery/core';

// Single element (throws if not found)
$('#app')
  .addClass('loaded')
  .css({ color: 'rebeccapurple', fontSize: '1.2rem' })
  .text('Hello, bQuery!');

// Multiple elements
$$('.card').each((el) => {
  el.toggleClass('visible');
});

The single-element wrapper (BQueryElement) covers:

  • Class/attribute helpers: addClass, removeClass, toggleClass, attr, removeAttr, data, prop
  • Content: text, html (sanitized by default!), htmlUnsafe, append, prepend, before, after
  • Visibility: show, hide, toggle, css
  • Events: on, once, off, trigger, delegate
  • Traversal: find, closest, parent, children, siblings, next, prev
  • DOM manipulation: wrap, unwrap, replaceWith, detach, scrollTo
  • Form helpers: serialize, serializeString, val
  • Dimensions: rect, offset, outerWidth, outerHeight, position

Notice that html() is sanitized by default. That’s the “secure by default” principle in practice: you have to explicitly call htmlUnsafe() to bypass it. A small thing that prevents a whole class of XSS bugs.

The core module also ships a solid utility belt:

import { debounce, throttle, merge, uid, utils } from '@bquery/bquery/core';

const save = debounce(() => console.log('saved!'), 300);
save.cancel(); // cancelable

const id = uid('component'); // "component-xyz123"

const merged = merge({ a: 1 }, { b: 2 }); // { a: 1, b: 2 }

Utilities include clone, pick, omit, slugify, truncate, chunk, flatten, compact, unique, randomInt, clamp, and a full suite of type guards (isString, isElement, isPromise, etc.). It’s the kind of utility layer that means you actually don’t need lodash.

4. Reactive Primitives: Signals All the Way Down

This is where bQuery steps firmly into modern territory. The reactive module gives you fine-grained reactivity through signals, the same primitive that’s now baked into Angular, Solid, and Preact Signals.

import { signal, computed, effect, batch, watch } from '@bquery/bquery/reactive';

const firstName = signal('John');
const lastName = signal('Doe');

// Computed values are lazy and cached
const fullName = computed(() => `${firstName.value} ${lastName.value}`);

// Effects run immediately, re-run on dependency change
effect(() => {
  document.title = fullName.value;
});

// Batch multiple updates into a single notification pass
batch(() => {
  firstName.value = 'Jane';
  lastName.value = 'Smith';
});

// Watch with old/new value comparison
const stop = watch(firstName, (newVal, oldVal) => {
  console.log(`Changed: ${oldVal} → ${newVal}`);
});

stop(); // unsubscribe

A few things worth highlighting:

signal.peek() reads the value without creating a reactive dependency. Useful when you need to read inside an effect without it re-subscribing.

signal.update(fn) updates based on the current value, which is handy for immutable patterns.

signal.dispose() removes all subscribers and prevents memory leaks. Important for long-lived apps.

readonly(signal) creates a read-only view. Great for exposing reactive state from a store without allowing external mutation.

untrack(() => ...) reads signals inside an effect without tracking them as dependencies.

persistedSignal syncs a signal to localStorage automatically, with graceful fallbacks for SSR and Safari private mode:

import { persistedSignal } from '@bquery/bquery/reactive';

const theme = persistedSignal('theme', 'light');
theme.value = 'dark'; // Saved to localStorage automatically

linkedSignal creates a writable computed: you provide both a getter and a setter, so writes can fan out to multiple underlying signals:

import { linkedSignal, signal } from '@bquery/bquery/reactive';

const first = signal('Ada');
const last = signal('Lovelace');

const fullName = linkedSignal(
  () => `${first.value} ${last.value}`,
  (next) => {
    const [nextFirst, nextLast] = next.split(' ');
    first.value = nextFirst ?? '';
    last.value = nextLast ?? '';
  }
);

fullName.value = 'Grace Hopper'; // Fans out to first and last

Errors inside effects are caught and logged rather than crashing the reactive system; subsequent updates keep working. That’s a nice resilience property you don’t always get for free.

5. Async Data & Fetching

Managing loading states, errors, and async lifecycles is boilerplate-heavy in vanilla JS. bQuery abstracts all of that into two composables.

useAsyncData wraps any async function in a signal-based lifecycle:

import { signal, useAsyncData } from '@bquery/bquery/reactive';

const userId = signal(1);
const user = useAsyncData(
  () => fetch(`/api/users/${userId.value}`).then(r => r.json()),
  {
    watch: [userId],        // re-run when userId changes
    defaultValue: null,
    onError: (err) => console.error('Failed:', err),
  }
);

// Reactive state you can bind directly to the DOM
console.log(user.status.value);  // 'idle' | 'pending' | 'success' | 'error'
console.log(user.pending.value); // boolean
console.log(user.data.value);    // the resolved data
console.log(user.error.value);   // Error | null

await user.refresh(); // manually trigger
user.clear();         // reset everything
user.dispose();       // stop watchers

useFetch builds on top of that and adds HTTP niceties: base URLs, query params, custom headers, automatic JSON serialization, and pluggable response parsers (json, text, blob, arrayBuffer, formData, response):

import { useFetch } from '@bquery/bquery/reactive';

const users = useFetch('/users', {
  baseUrl: 'https://api.example.com',
  query: { page: 1, include: 'profile' },
  headers: { authorization: 'Bearer my-token' },
});

For shared defaults across multiple requests, createUseFetch acts as a factory:

import { createUseFetch } from '@bquery/bquery/reactive';

const useApi = createUseFetch({
  baseUrl: 'https://api.example.com',
  headers: { 'x-client': 'my-app' },
});

const profile = useApi('/profile');
const posts = useApi('/posts', { query: { page: 2 } });

This pattern is really clean for larger apps where you want a pre-configured fetch instance rather than repeating base URLs everywhere.

6. Building Web Components with bQuery

The component module is where bQuery really shines for component-driven architectures. It wraps the native Custom Elements API with typed props, optional internal state, scoped reactivity, and a sanitized render function.

import { component, html, bool } from '@bquery/bquery/component';

component('user-card', {
  props: {
    username: { type: String, required: true },
    avatar: { type: String, default: '/default-avatar.png' },
    active: { type: Boolean, default: false },
  },
  state: { clicks: 0 },
  styles: `
    .card { display: grid; gap: 0.5rem; padding: 1rem; border-radius: 8px; }
    .active { border: 2px solid #4f46e5; }
  `,
  connected() {
    console.log('user-card mounted');
  },
  disconnected() {
    console.log('user-card removed');
  },
  render({ props, state, emit }) {
    return html`
      <button
        class="card ${props.active ? 'active' : ''}"
        ${bool('disabled', !props.active)}
        @click=${() => {
          this.setState('clicks', state.clicks + 1);
          emit('card-clicked', { username: props.username });
        }}
      >
        <img src="${props.avatar}" alt="${props.username}" />
        <strong>${props.username}</strong>
        <span>Clicked ${state.clicks} times</span>
      </button>
    `;
  },
});
<!-- Usage -->
<user-card username="Jonas" active></user-card>

A few things to appreciate here:

Props are typed and coerced automatically. Strings stay strings, numbers get Number() called on them, booleans understand 'true', '', '1', 'false', '0'. Objects get JSON.parsed. You can also add a validator function to enforce invariants at runtime.

The render output is sanitized before being written to the Shadow DOM. You get security by default with an explicit opt-in mechanism (safeHtml, trusted) when you need to pass sanitized fragments through.

Shadow DOM mode is configurable. Open shadow root by default, but you can go closed or render directly into light DOM:

component('inline-banner', {
  shadow: false, // renders in light DOM
  render: () => html`<p class="banner">No shadow needed here</p>`,
});

Lifecycle hooks cover everything you need: beforeMount, connected, beforeUpdate (return false to cancel a re-render), updated, disconnected, onError, onAdopted, and onAttributeChanged.

Scoped reactive helpers (useSignal, useComputed, useEffect) create component-local reactive resources that are cleaned up automatically on disconnect, with no manual cleanup needed:

component('live-timer', {
  state: { seconds: 0 },
  connected() {
    const tick = useSignal(0);
    const interval = setInterval(() => tick.value++, 1000);

    useEffect(() => {
      this.setState('seconds', tick.value);
    });

    this.disconnected = () => clearInterval(interval);
  },
  render({ state }) {
    return html`<p>Elapsed: ${state.seconds}s</p>`;
  },
});

External signals can drive re-renders via the signals option, keeping component updates predictable:

import { signal, computed } from '@bquery/bquery/reactive';

const theme = signal<'light' | 'dark'>('light');
const themeClass = computed(() => `theme-${theme.value}`);

component('theme-badge', {
  props: {},
  signals: { themeClass },
  render({ signals }) {
    return html`<span class="${signals.themeClass.value}">Current theme</span>`;
  },
});

7. @bquery/ui: The Default Component Library

bQuery ships a companion component library that’s registered through registerDefaultComponents(). It’s a small, zero-dependency set of native UI primitives; no external CSS framework required.

import { defineBqueryConfig } from '@bquery/bquery/platform';
import { registerDefaultComponents } from '@bquery/bquery/component';

// Configure a custom prefix (default is 'ui')
defineBqueryConfig({
  components: { prefix: 'ui' },
  fetch: { baseUrl: 'https://api.example.com' },
  transitions: { skipOnReducedMotion: true },
});

const tags = registerDefaultComponents();

console.log(tags);
// {
//   button: 'ui-button',
//   card: 'ui-card',
//   input: 'ui-input',
//   textarea: 'ui-textarea',
//   checkbox: 'ui-checkbox'
// }

The available primitives:

  • ui-button: pill-shaped button with variant, size, type, and disabled props
  • ui-card: container with optional title, footer, and elevated props
  • ui-input: labeled text input that emits input events with { value }
  • ui-textarea: labeled textarea, same event contract as ui-input
  • ui-checkbox: labeled checkbox that emits change events with { checked }

These components use regular HTML slots and bubble custom events, so they play nicely with forms, routers, and shadow DOM boundaries. You can compose them directly in your markup:

<ui-card title="Create Account">
  <ui-input label="Name"></ui-input>
  <ui-input label="Email"></ui-input>
  <ui-checkbox label="Accept terms"></ui-checkbox>
  <ui-button variant="primary" type="submit">Sign Up</ui-button>
</ui-card>

And wire them up reactively:

import { $, signal } from '@bquery/bquery';

const name = signal('');
const email = signal('');

document.querySelector('ui-input[label="Name"]')
  ?.addEventListener('input', (e) => {
    name.value = (e as CustomEvent<{ value: string }>).detail.value;
  });

The prefix system via defineBqueryConfig is a nice touch for teams with strict naming conventions, or for avoiding collisions when integrating bQuery components into an existing design system.

8. The Broader Ecosystem at a Glance

bQuery v1.7.0 covers a surprising amount of ground beyond what we’ve walked through. Here’s a quick tour of the other modules:

Router: full SPA routing with constrained params, guards, redirects, and declarative <bq-link> elements.

Store: signal-based global state with persistence, migrations, and action lifecycle hooks. Think a lightweight Pinia, but framework-agnostic.

View: declarative bq-* attribute directives (bq-text, bq-show, bq-class, etc.) for template-style binding without a full component.

Motion: transitions, FLIP animations, springs, parallax, typewriter effects, and scroll-linked animations. Respects prefers-reduced-motion by default when configured.

i18n: reactive locale state, nested translation keys, pluralization rules, and Intl-based date/number/relative-time formatting.

a11y: focus traps, skip navigation links, live region announcers, and audit helpers that flag missing ARIA attributes at runtime.

DnD: make any element draggable, define drop zones, and build sortable lists without reaching for a third-party library.

Testing: renderComponent, fireEvent, and waitFor utilities that mirror what you’d expect from Testing Library.

SSR: renderToString for server-side HTML generation and hydrateMount for seamlessly picking up where the server left off.

Storybook helpers: storyHtml() and when() for writing safe Storybook stories with boolean attribute shorthand (?disabled=${true}).

9. When Should You Reach for bQuery?

bQuery isn’t trying to replace React, Vue, or Svelte for large-scale applications with complex component trees and heavy state management. It’s solving a different problem.

Reach for bQuery when:

  • You want reactivity and component primitives without a build pipeline: prototypes, experiments, browser extensions, internal tools
  • You’re writing vanilla JS/TS and want jQuery’s ergonomics plus modern signal-based reactivity
  • You need native Web Components with typed props and a sane lifecycle, but don’t want to set up Lit or Stencil
  • You’re building progressively enhanced pages where a CDN import is all you need
  • You want to ship accessible, secure-by-default UI without bolting on extra libraries for sanitization, focus management, and ARIA
  • You’re working on a small-to-medium project where a full SPA framework would be overkill

It’s also genuinely useful as a companion in larger apps: you could use bQuery’s reactive core alongside an existing codebase for specific interactive islands, without committing to a full rewrite.

10. Conclusion

bQuery v1.7.0 is one of those rare libraries that manages to feel both nostalgic and completely modern at the same time. It channels the simplicity of jQuery while embracing everything the web platform has become: signals, Web Components, Trusted Types, fetch, Shadow DOM, the whole lot.

The zero-build path alone makes it worth knowing about. Being able to drop a single CDN import into an HTML file and immediately have signals, reactive DOM manipulation, typed components, and async data composables is genuinely impressive.

If you’ve been eyeing signals-based reactivity but felt the existing frameworks were too opinionated or too heavyweight for your use case, bQuery is absolutely worth exploring.

Give it a spin:

  • 📦 npm: @bquery/bquery
  • 📖 Docs: bquery.flausch-code.de

Thanks for reading! If you have questions or feedback, drop them in the comments. And if you’re using bQuery in a project, I’d love to hear about it.

Run Any HuggingFace Model on TPUs: A Beginner’s Guide to TorchAX

TorchAX

What if you could run any HuggingFace model on TPUs — without rewriting a single line of model code?

Here is what the end result looks like:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it", torch_dtype="bfloat16")

import torchax
torchax.enable_globally()  # Enable AFTER loading the model

model.to("jax")  # That's it. Now running on JAX.

Five lines. Your PyTorch model is now executing on JAX — with access to TPUs, JIT compilation, and automatic parallelism across devices.

In this tutorial, we will go from zero to building a working chatbot powered by a HuggingFace model running on JAX. Along the way, you will learn key JAX concepts, see real benchmarks, and understand why this approach exists.

  • Open Full Tutorial In Colab
  • Open Quick Start In Colab

Why This Matters: The HuggingFace + JAX Problem

In 2024, HuggingFace removed native JAX and TensorFlow support from its transformers library to focus development on PyTorch. This left thousands of JAX users — especially those running on Google Cloud TPUs — without a straightforward way to use HuggingFace’s massive model collection.

What is JAX?

If you are new to JAX, think of it as Google’s high-performance numerical computing library. It looks like NumPy on the surface, but under the hood it offers three powerful capabilities:

  1. JIT Compilation — JAX can compile your Python code into optimized machine code using the XLA compiler. The first run is slower (compilation), but every subsequent call is dramatically faster.

  2. TPU Support — JAX is the native programming model for Google’s Tensor Processing Units. If you want to use TPUs, JAX is the most natural path.

  3. Automatic Parallelism — JAX can automatically distribute computation across multiple devices (TPUs or GPUs) using a single-program model called gSPMD. You describe what should be sharded; the compiler figures out how.

Enter TorchAX

torchax is a library from Google that bridges PyTorch and JAX. It works by creating a special torch.Tensor subclass that secretly holds a jax.Array inside. When PyTorch operations are called on this tensor, torchax intercepts them and executes the JAX equivalent instead.

Think of it like a Trojan horse: PyTorch thinks it is working with regular tensors, but the computation is actually happening on JAX.

PyTorch Model
    |
    v
torchax.Tensor (looks like torch.Tensor)
    |
    v
jax.Array (actual computation on TPU/GPU)

This means you can take any PyTorch model — including HuggingFace models — and run it on JAX without modifying the model code at all.

Credits: This tutorial builds on the excellent 3-part blog series by Han Qi (@qihqi), the author of torchax, and on the torchax documentation. We expand on those tutorials with beginner-friendly explanations, a different model (Gemma instead of Llama), benchmarks, and a complete Colab-ready notebook.

TorchAX vs. the Alternatives

Before diving into code, it helps to understand where torchax fits in the broader ecosystem:

Approach | Effort | Performance | Best For
--- | --- | --- | ---
Rewrite in Flax/Equinox | High (full rewrite) | Native JAX speed | New projects starting in JAX
torch-xla (PyTorch/XLA) | Low (add XLA device) | Good (XLA compiled) | PyTorch training on TPUs
torchax | Low (change device to ‘jax’) | Great (JAX JIT + interop) | Running HF models on JAX, mixing PyTorch + JAX
ONNX export | Medium (export + runtime) | Variable | Cross-framework deployment

When should you use torchax? When you have a PyTorch model (especially from HuggingFace) and want to leverage JAX’s JIT compilation, TPU support, or interop with JAX libraries — without rewriting the model.

Prerequisites & Setup

What you need:

  • Python 3.10+
  • Basic familiarity with PyTorch (loading models, running inference)
  • A Google Colab account (free tier works for the 1B model)

Zero-setup option: Click the Colab badge above. The notebook handles all installation automatically.

Local setup:

# 1. Install PyTorch (CPU version — torchax handles the accelerator)
pip install torch --index-url https://download.pytorch.org/whl/cpu  # Linux
# pip install torch  # macOS

# 2. Install JAX for your accelerator
pip install -U jax[tpu]     # Google Cloud TPU
# pip install -U jax[cuda12]  # NVIDIA GPU
# pip install -U jax          # CPU only

# 3. Install torchax, transformers, and flax (for JAX compatibility)
pip install -U torchax transformers flax

Key Concepts for Beginners

Before we write code, let’s demystify three JAX concepts you will encounter throughout this tutorial.

Pytrees: JAX’s Data Containers

A pytree is any nested structure of Python containers (dicts, lists, tuples) with arrays as leaves. JAX uses pytrees everywhere — model weights are pytrees, function inputs/outputs are pytrees.

Think of a pytree like a shipping box with labeled compartments. JAX knows how to open standard boxes (dicts, lists, tuples), pull out all the arrays, do math on them, and put them back.

The catch: JAX does not know how to open custom boxes. HuggingFace defines custom output types like CausalLMOutputWithPast — we need to teach JAX how to unpack and repack these. This is called pytree registration, and we will see it in action shortly.
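To make the box analogy concrete, here is a minimal, standalone sketch (plain jax.tree_util, no HuggingFace types involved) of flattening a pytree into its leaves and rebuilding it:

```python
import jax

# A pytree: nested dicts/lists with "leaf" values at the bottom
params = {"embed": [1.0, 2.0], "head": {"w": 3.0}}

# Flatten pulls out the leaves and remembers the container structure
leaves, treedef = jax.tree_util.tree_flatten(params)
# leaves is a flat list; dict entries come out in sorted-key order

# Unflatten puts (possibly transformed) leaves back into the same structure
doubled = jax.tree_util.tree_unflatten(treedef, [x * 2 for x in leaves])
print(doubled)  # same nesting as params, every leaf doubled
```

register_pytree_node does exactly this for custom classes: it supplies the "open the box" and "repack the box" functions for types JAX does not know about.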

JIT Compilation: Translate Once, Run Fast Forever

JIT (Just-In-Time) compilation is like translating a recipe from English to machine code. The first time you call a JIT-compiled function, JAX traces through it, records all the operations, and compiles an optimized version. Subsequent calls skip the tracing and run the compiled version directly.

First call:  Python code → trace → compile → execute  (slow)
Second call: compiled code → execute                   (fast!)

The speedup can be 10-100x or more. The trade-off is that the compiled function is specialized for the input shapes it was traced with — if shapes change, JAX recompiles.
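You can observe the trace-once behavior directly: Python side effects inside a jitted function run only while JAX is tracing, not on every call. A small sketch:

```python
import jax
import jax.numpy as jnp

trace_count = 0

@jax.jit
def double(x):
    global trace_count
    trace_count += 1  # Python side effect: executes only during tracing
    return x * 2

double(jnp.ones((3,)))  # first call: trace + compile
double(jnp.ones((3,)))  # same shape: reuses the compiled version
double(jnp.ones((4,)))  # new shape: JAX retraces and recompiles
print(trace_count)      # 2 traces for 3 calls
```

This is also why the "shapes change every iteration" problem in autoregressive decoding (covered later) is so costly: every new shape pays the tracing and compilation price again.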

Static vs. Dynamic Values

When JAX traces a function for JIT, it treats inputs as abstract shapes, not concrete values. If your code has a branch like if use_cache:, JAX cannot evaluate it during tracing because use_cache is abstract. This causes a ConcretizationTypeError.

The fix: mark such values as static (compile-time constants) so JAX knows their actual value during tracing. We will see two ways to do this: closures and static_argnums.
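Both fixes can be shown on a toy function standing in for the model (the names here are illustrative, not torchax API):

```python
import jax
import jax.numpy as jnp

def forward(x, use_cache):
    if use_cache:  # plain Python branch on the flag
        return x
    return x * 2

x = jnp.ones((2,))

# Naively jitting fails: use_cache becomes an abstract tracer,
# so `if use_cache:` cannot be evaluated during tracing
try:
    jax.jit(forward)(x, False)
except TypeError:
    pass  # concretization error raised at trace time

# Fix 1: a closure bakes the flag in as a compile-time constant
jit_no_cache = jax.jit(lambda x: forward(x, use_cache=False))
y_closure = jit_no_cache(x)

# Fix 2: static_argnums marks argument 1 as a compile-time constant;
# a different flag value triggers a separate compilation
jit_static = jax.jit(forward, static_argnums=1)
y_static = jit_static(x, True)
```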

Step 1: Your First Forward Pass

Let’s load a model and run it on JAX. We will use Gemma 3 1B IT — a small, instruction-tuned model from Google that runs comfortably on free Colab hardware.

import torch
import torchax
import jax
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "google/gemma-3-1b-it"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="cpu"
)

# Enable torchax globally AFTER model loading
# This prevents intercepting unsupported initialization ops
torchax.enable_globally()

# Move model weights to the JAX device
model.to("jax")

# Tokenize an input prompt
prompt = "The secret to baking a good cake is"
inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to("jax")

# Run a forward pass (eager mode)
start = time.perf_counter()
with torch.no_grad():
    outputs = model(input_ids, use_cache=False)
elapsed = time.perf_counter() - start

print(f"Output logits shape: {outputs.logits.shape}")
print(f"Eager forward pass: {elapsed:.3f}s")

What happened:

  1. We load the model on CPU first, then call torchax.enable_globally(). This ordering is important — enabling torchax before model loading can intercept unsupported initialization ops and cause errors.
  2. model.to("jax") moves every parameter from CPU to the JAX device — just like model.to("cuda") for GPUs.
  3. The forward pass runs through PyTorch’s code path, but every operation is executed by JAX under the hood.

The output logits tensor has shape (1, sequence_length, vocab_size). Each position contains a score for every token in the vocabulary — the highest score is the model’s prediction for the next token.
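Picking the prediction out of that tensor is a one-liner. A tiny torch sketch with random scores (no model needed; the vocabulary size is made up):

```python
import torch

vocab_size = 32  # toy vocabulary
logits = torch.randn(1, 5, vocab_size)  # (batch, seq_len, vocab_size)

# Scores for the token that follows the last position
next_token_logits = logits[0, -1, :]

# Greedy choice: the highest-scoring token id
next_token_id = int(torch.argmax(next_token_logits))
print(next_token_id)  # an id in [0, vocab_size)
```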

Step 2: Speed It Up with JIT Compilation

The eager forward pass works, but it is slow — every operation goes through Python one at a time. Let’s compile the model for dramatically faster inference.

The extract_jax Approach

The torchax.extract_jax() function converts a PyTorch model into a pure JAX function:

# Extract a JAX-callable function and the model weights as a pytree
weights, jax_func = torchax.extract_jax(model)

This returns two things:

  • weights — the model’s state_dict as a pytree of jax.Arrays
  • jax_func — a function with signature jax_func(weights, args_tuple, kwargs_dict)

Register HuggingFace Output Types as Pytrees

Before we can JIT this function, we need to teach JAX about HuggingFace’s custom types:

from jax.tree_util import register_pytree_node
from transformers import modeling_outputs, cache_utils

# Register CausalLMOutputWithPast
def output_flatten(v):
    return v.to_tuple(), None

def output_unflatten(aux, children):
    return modeling_outputs.CausalLMOutputWithPast(*children)

register_pytree_node(
    modeling_outputs.CausalLMOutputWithPast,
    output_flatten,
    output_unflatten,
)

# Register DynamicCache
def _flatten_dynamic_cache(cache):
    return (cache.key_cache, cache.value_cache), None

def _unflatten_dynamic_cache(aux, children):
    c = cache_utils.DynamicCache()
    c.key_cache, c.value_cache = children
    return c

register_pytree_node(
    cache_utils.DynamicCache,
    _flatten_dynamic_cache,
    _unflatten_dynamic_cache,
)

Handle Static Arguments with a Closure

The use_cache flag is a boolean that JAX cannot trace. We wrap it in a closure to make it a compile-time constant:

def forward_no_cache(weights, input_ids):
    return jax_func(weights, (input_ids,), {"use_cache": False})

jitted_forward = jax.jit(forward_no_cache)

Benchmark: Eager vs. JIT

# Convert input to a native JAX array for jax.jit
jax_input_ids = jax.device_put(inputs["input_ids"].numpy())

# Warm up (first call triggers compilation)
res = jitted_forward(weights, jax_input_ids)
jax.block_until_ready(res)

# Benchmark 3 runs
for i in range(3):
    start = time.perf_counter()
    res = jitted_forward(weights, jax_input_ids)
    jax.block_until_ready(res)
    elapsed = time.perf_counter() - start
    print(f"Run {i}: {elapsed:.4f}s")

Expected output (times will vary by hardware):

Run 0: 0.0142s  # Already compiled from warm-up
Run 1: 0.0038s
Run 2: 0.0035s

The JIT-compiled version runs orders of magnitude faster than eager mode. This is the power of XLA compilation — operations are fused, memory is optimized, and the accelerator runs a single optimized program.

Step 3: The Simpler API — torchax.compile

The extract_jax + manual JIT approach gives you full control, but for most cases there is a simpler way. The catch is that torchax.compile() uses jax.jit under the hood, so we need to avoid passing dynamic boolean flags like use_cache. We wrap the model in a thin module that bakes in these constants:

import torch.nn as nn

class NoCacheModel(nn.Module):
    def __init__(self, base_model):
        super().__init__()
        self.base_model = base_model

    def forward(self, input_ids):
        # Return only logits to avoid HuggingFace output class pytree issues
        return self.base_model(input_ids, use_cache=False, return_dict=False)[0]

# One-liner: compile the wrapped model
compiled_model = torchax.compile(NoCacheModel(model))

# Use it like a normal PyTorch model
with torch.no_grad():
    logits = compiled_model(input_ids)

Under the hood, torchax.compile() wraps your model in a JittableModule and applies jax.jit. The first call triggers compilation; subsequent calls are fast. The NoCacheModel wrapper ensures that boolean flags are constants (not traced) and that the output is a plain tensor (not a custom HuggingFace type that needs pytree registration).

Step 4: Text Classification

Let’s use our JIT-compiled model for a practical task — sentiment classification. Since Gemma is an instruction-tuned model, we can use prompt engineering:

def classify_sentiment(text, model, tokenizer):
    prompt = f"""Classify the following text as POSITIVE or NEGATIVE.
Text: "{text}"
Classification:"""

    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to("jax")

    with torch.no_grad():
        outputs = model(input_ids, use_cache=False)

    # Get the predicted next token
    next_token_logits = outputs.logits[0, -1, :]
    next_token_id = torch.argmax(next_token_logits).item()
    prediction = tokenizer.decode([next_token_id]).strip()
    return prediction

# Test it
texts = [
    "This movie was absolutely fantastic, I loved every minute!",
    "The service was terrible and the food was cold.",
    "A perfectly average experience, nothing special.",
]

for text in texts:
    result = classify_sentiment(text, model, tokenizer)
    print(f"Text: {text[:50]}...  =>  {result}")

Step 5: Text Generation (Autoregressive Decoding)

Classification is useful, but the real power of LLMs is generating text. Let’s understand how this works.

How Autoregressive Decoding Works

An LLM predicts one token at a time. Given an input of length n, it produces scores for the next token. We pick one (e.g., the highest-scoring token via greedy decoding), append it to the input, and repeat:

Iteration 1: input (1, n)     → output (1, n)     → pick token
Iteration 2: input (1, n+1)   → output (1, n+1)   → pick token
Iteration 3: input (1, n+2)   → output (1, n+2)   → pick token
...
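The loop itself is simple. Here is a toy version with a fake "model" (a lookup table in which token t always scores t+1 highest; purely illustrative) just to show the append-and-repeat shape of greedy decoding:

```python
import numpy as np

vocab = 5
# Toy "model": row t holds next-token scores, with t+1 scored highest
score_table = np.eye(vocab)[(np.arange(vocab) + 1) % vocab]

tokens = [0]  # the prompt
for _ in range(3):
    logits = score_table[tokens[-1]]       # scores for the next token
    tokens.append(int(np.argmax(logits)))  # greedy pick, append, repeat

print(tokens)  # [0, 1, 2, 3]
```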

The problem: input shapes change every iteration. JIT compilation specializes for fixed shapes, so changing shapes means recompilation every step — worse than eager mode.

The KV Cache Solution

The KV (Key-Value) cache stores intermediate computations from previous tokens so the model only needs to process the new token each iteration:

Iteration 1: input (1, n)              → output + kv_cache(n)
Iteration 2: input (1, 1) + cache(n)   → output + kv_cache(n+1)
Iteration 3: input (1, 1) + cache(n+1) → output + kv_cache(n+2)

With a DynamicCache, the cache grows each step — shapes still change. With a StaticCache, the cache has a fixed maximum length — shapes stay constant, making it JIT-friendly.
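The fixed shape is the whole trick: a static cache behaves like a preallocated buffer written in place at the current position, so the compiler sees identical shapes on every step. A NumPy sketch of the idea (toy dimensions, not the transformers implementation):

```python
import numpy as np

max_len, head_dim = 8, 4  # fixed maximum length, toy head dimension
k_cache = np.zeros((max_len, head_dim))  # allocated once, up front

for pos in range(3):  # one decode step per token
    new_key = np.full(head_dim, pos + 1.0)  # stand-in for the new token's key
    k_cache[pos] = new_key                  # in-place write at cache_position

print(k_cache.shape)  # stays (8, 4) on every step, so jit never recompiles
```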

Implementation with StaticCache

from transformers.cache_utils import StaticCache

# Register StaticCache as a pytree
def _flatten_static_cache(cache):
    return (
        cache.key_cache, cache.value_cache
    ), (cache.config, cache.max_batch_size, cache.max_cache_len,
        getattr(cache, "device", None), getattr(cache, "dtype", None))

def _unflatten_static_cache(aux, children):
    config, max_batch_size, max_cache_len, device, dtype = aux
    kwargs = {}
    if device is not None: kwargs["device"] = device
    if dtype is not None: kwargs["dtype"] = dtype
    cache = StaticCache(config, max_batch_size, max_cache_len, **kwargs)
    cache.key_cache, cache.value_cache = children
    return cache

register_pytree_node(
    StaticCache,
    _flatten_static_cache,
    _unflatten_static_cache,
)

def generate_text(model, tokenizer, prompt, max_new_tokens=50):
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to("jax")
    batch_size, seq_length = input_ids.shape

    # Create a static cache with fixed maximum length
    past_key_values = StaticCache(
        config=model.config,
        max_batch_size=1,
        max_cache_len=seq_length + max_new_tokens,
        device="jax",
        dtype=model.dtype,
    )
    cache_position = torch.arange(seq_length, device="jax")

    # Prefill: process the full prompt
    with torch.no_grad():
        logits, past_key_values = model(
            input_ids,
            cache_position=cache_position,
            past_key_values=past_key_values,
            return_dict=False,
            use_cache=True,
        )

    next_token = torch.argmax(logits[:, -1], dim=-1)[:, None]
    generated_ids = [next_token[:, 0].item()]
    cache_position = torch.tensor([seq_length], device="jax")

    # Decode: generate one token at a time
    for _ in range(max_new_tokens - 1):
        with torch.no_grad():
            logits, past_key_values = model(
                next_token,
                cache_position=cache_position,
                past_key_values=past_key_values,
                return_dict=False,
                use_cache=True,
            )
        next_token = torch.argmax(logits[:, -1], dim=-1)[:, None]
        token_id = next_token[:, 0].item()

        if token_id == tokenizer.eos_token_id:
            break
        generated_ids.append(token_id)
        cache_position += 1

    return tokenizer.decode(generated_ids, skip_special_tokens=True)

# Generate!
result = generate_text(model, tokenizer, "The secret to baking a good cake is")
print(result)

Step 6: Distributed Inference (Tensor Parallelism)

If you have access to multiple devices (e.g., a TPU v2-8 with 8 chips, or multi-GPU), you can shard the model weights across devices for faster inference.

How Tensor Parallelism Works

In tensor parallelism, we split weight matrices across devices:

  • Column-parallel: Q, K, V, Gate, and Up projections are split along the output dimension
  • Row-parallel: O and Down projections are split along the input dimension
  • Between each column-parallel/row-parallel pair, only a single all-reduce operation is needed: one after the attention block and one after the MLP
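The arithmetic behind this split can be checked with a small NumPy sketch that simulates two devices (the names and shapes here are illustrative, not part of torchax):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # activations: (batch, hidden)
W_up = rng.standard_normal((8, 32))    # column-parallel weight
W_down = rng.standard_normal((32, 8))  # row-parallel weight

# Reference: full, unsharded computation
ref = (x @ W_up) @ W_down

# Device 0 and device 1 each hold half of W_up's columns (output dim)
# and the matching half of W_down's rows (input dim)
h0 = x @ W_up[:, :16]       # column-parallel: no communication needed
h1 = x @ W_up[:, 16:]
p0 = h0 @ W_down[:16, :]    # row-parallel: produces a partial sum
p1 = h1 @ W_down[16:, :]

# A single all-reduce (here just a sum) recovers the full result
out = p0 + p1
assert np.allclose(out, ref)
```

Each device computes its partial product independently; the sum at the end is the one all-reduce per column/row pair.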

JAX’s gSPMD handles the communication automatically — you just specify how each weight should be sharded.

Sharding the Weights

from jax.sharding import PartitionSpec as P, NamedSharding

# Create a device mesh
mesh = jax.make_mesh((jax.device_count(),), ("axis",))

def shard_weights(mesh, weights):
    sharded = {}
    for name, tensor in weights.items():
        if any(k in name for k in ["q_proj", "k_proj", "v_proj", "gate_proj", "up_proj"]):
            spec = P("axis", None)  # Column-parallel
        elif any(k in name for k in ["o_proj", "down_proj", "lm_head", "embed_tokens"]):
            spec = P(None, "axis")  # Row-parallel
        else:
            spec = P()  # Replicate (e.g., layer norms)
        sharded[name] = jax.device_put(tensor, NamedSharding(mesh, spec))
    return sharded

# Apply sharding
weights, jax_func = torchax.extract_jax(model)
weights = shard_weights(mesh, weights)

# Replicate the input across all devices
input_ids_sharded = jax.device_put(
    inputs["input_ids"], NamedSharding(mesh, P())
)

With sharded weights, the same jax.jit-compiled function now runs in parallel across all devices. The XLA compiler automatically inserts the necessary all-reduce operations.

Note: Tensor parallelism requires a multi-device environment. On free Colab TPU (single device), this section is for illustration. Use a TPU v2-8 or multi-GPU setup to run it.

Step 7: Build a Mini Chatbot

Let’s wrap everything into a simple chat function using Gemma’s instruction template:

def chat(model, tokenizer, user_message, max_new_tokens=100):
    # Gemma instruction format
    prompt = f"<start_of_turn>user\n{user_message}<end_of_turn>\n<start_of_turn>model\n"
    response = generate_text(model, tokenizer, prompt, max_new_tokens)
    return response

# Example conversation
questions = [
    "What is JAX and why would I use it?",
    "Explain tensor parallelism in simple terms.",
    "Write a haiku about machine learning.",
]

for q in questions:
    print(f"User: {q}")
    print(f"Gemma: {chat(model, tokenizer, q)}")
    print()

Swapping to a Larger Model

Everything above uses google/gemma-3-1b-it (1B parameters). To use a larger model, change the model name:

# 4B model — needs more memory (Colab Pro or multi-device)
model_name = "google/gemma-3-4b-it"

The rest of the code remains identical. Larger models produce higher-quality outputs but require more memory and compute. The 4B and larger Gemma 3 variants benefit significantly from tensor parallelism on multi-device setups.

Other models that work well with torchax include any standard HuggingFace AutoModelForCausalLM architecture — GPT-2, Llama, Mistral, Phi, and more.

Troubleshooting

TypeError: ... is not a valid JAX type
You need to register the type as a pytree. See the registration examples above for CausalLMOutputWithPast, DynamicCache, and StaticCache.

ConcretizationTypeError: Abstract tracer value encountered
A value that changes between calls (like a boolean flag) needs to be either: (1) made static via static_argnums in jax.jit, or (2) baked into a closure as a constant.

UserWarning: A large amount of constants were captured
Model weights are being inlined as constants in the compiled graph. Pass them as explicit function arguments instead of closing over them.
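A sketch of the fix (array sizes are illustrative): instead of closing over large arrays, pass them to the jitted function as arguments, so XLA treats them as runtime inputs rather than baked-in constants.

```python
import jax
import jax.numpy as jnp

w = jnp.ones((512, 512))  # stand-in for model weights

# Closing over `w` inlines it as a constant into the compiled graph,
# which triggers the "large amount of constants" warning:
bad = jax.jit(lambda x: x @ w)

# Passing `w` explicitly keeps it a regular input to the computation:
good = jax.jit(lambda x, w: x @ w)

x = jnp.ones((1, 512))
out = good(x, w)
```

This is the same reason torchax's extract_jax returns the weights separately: the jitted forward function takes them as arguments on every call.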

RuntimeError: No available devices
Ensure JAX can see your accelerator: print(jax.devices()). In Colab, check that your runtime type is set to TPU or GPU.

Conclusion

In this tutorial, we went from zero to a working chatbot running a HuggingFace model on JAX:

  1. Forward pass — moved a PyTorch model to JAX with model.to("jax")
  2. JIT compilation — compiled for 10-100x speedup with jax.jit
  3. Text classification — used prompt engineering for sentiment analysis
  4. Text generation — implemented autoregressive decoding with StaticCache
  5. Distributed inference — sharded weights across devices with tensor parallelism
  6. Chatbot — wrapped generation in an instruction-following chat function

The key insight: torchax lets you use the entire HuggingFace ecosystem — models, tokenizers, configs — while running on JAX’s high-performance backend. No model rewrites needed.

Resources

  • torchax GitHub — library source and documentation
  • torchax Docs — official getting started guide
  • Original tutorial series by Han Qi — the 3-part blog series this tutorial builds on
  • JAX Documentation — JIT compilation, pytrees, distributed arrays
  • HuggingFace LLM Inference Optimization — StaticCache and torch.compile docs
  • Companion GitHub repo — all code, notebooks, and diagrams

Credits

This tutorial would not be possible without the work of:

  • Han Qi (@qihqi) — author of torchax and the original HuggingFace + JAX tutorial series
  • The torchax team at Google — for building and maintaining the library
  • The HuggingFace team — for the transformers ecosystem
  • The JAX team at Google — for JAX, XLA, and TPU support

What model will you try running on TPUs first? Let me know in the comments!