Typesafe IndexedDB migrations

I feel a certain cognitive dissonance when working with IndexedDB in TypeScript.

On the one hand, TypeScript provides all the tools needed to model data and its transformations with incredible precision and fidelity. On the other hand, a TypeScript schema is frozen in time: it only models the latest or “final” version of the database.

Database migrations are difficult

One of the first realities IDB forces you to confront is the existence of multiple versions of the database at different points in time. You’ve got the latest version of the database you’re working on in dev, then the one your users are running that was pushed to prod last week, and maybe even a stale copy from several versions back on a client that hasn’t visited the site in a few months. This concept of version migrations is core to IDB’s design, and the API forces you to deal with it from the moment you first open the database. That’s when an upgradeneeded event fires so you can bring an older DB version up to date with the canonical version your codebase deals with (the one that matches the TypeScript schema you declared). That might involve adding or removing object stores, adding new fields to existing stores or renaming them, or running custom migration code to transform the structure of data from one version to the next.

Upgrade needed

The problem is that the actual migration code is a second source of truth about the database, one that can easily drift from the schema. If you add a new required field to an object store but forget to add a default or fallback for old rows that don’t have it yet, the schema will claim that the property is always there. But old data in the database won’t match that assumption, potentially causing “cannot read properties of undefined” errors at runtime in places where TypeScript claims they’re impossible. There’s a divergence between what the schema says and what the migrations say.

The core problem is that the schema defined on the database client isn’t much better than one big unchecked typecast.
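To make the drift concrete, here is a minimal sketch that fakes the object store with a plain array, so no real IndexedDB is involved. The `UserV1` and `User` types and the `readUsers` helper are hypothetical names chosen for illustration; the cast inside `readUsers` stands in for what a typed database client effectively does.

```typescript
// Shape of the rows as they were written under version 1.
type UserV1 = { id: number; name: string }

// The "current" schema, after version 2 added a required field.
type User = { id: number; name: string; email: string }

// Stand-in for rows already sitting in a user's browser from version 1.
const storedRows: UserV1[] = [{ id: 1, name: "Ada" }]

// Hypothetical read helper: it simply asserts that whatever comes out of
// storage matches the latest schema. This is the unchecked typecast.
function readUsers(): User[] {
  return storedRows as unknown as User[]
}

// TypeScript believes `user.email` is always a string, so this compiles.
function shoutEmail(user: User): string {
  return user.email.toUpperCase()
}

// The old row has no `email`, so the call throws a TypeError at runtime.
let failure: unknown = null
try {
  readUsers().map(shoutEmail)
} catch (err) {
  failure = err
}
```

Nothing in this snippet is flagged by the compiler, which is exactly the problem: the only thing connecting the stored rows to the `User` type is the cast.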

Native imperative API

If you haven’t had the pleasure, IDB’s native API is imperative and object-oriented in style. It predates async and promises in JavaScript, so it also uses callbacks pretty heavily.

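// Open (or create) the database, requesting schema version 1.
const request = indexedDB.open("mydb", 1)

// Fires when the database is new or its stored version is lower than the requested one.
request.onupgradeneeded = (event) => {
  const db = request.result
  // oldVersion is 0 for a brand-new database.
  if (event.oldVersion < 1) {
    db.createObjectStore("users", { keyPath: "id" })
  }
}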

TypeScript isn’t able to statically analyze the resulting sequence of method calls in a way that narrows or infers type information. Every statement could potentially have unknown side effects or interact with the prototype chain in unpredictable ways, not to mention the potential effects of branching and control flow around the statements, so it’s not really safe to make any assumptions about it at compile time.

What would it look like if we were able to maintain that type safety through every successive version of the database, and get compile-time checking for whether the declared schema matches what the migrations say? Or even infer the final schema directly from the migrations themselves as the source of truth?

Chained builder API

The first step towards tackling this is to wrap IDB’s native API in something that’s more conducive to getting rich type information to flow across versions of the database.

import { createMigrations, schema } from "idb-builder"

const migrations = createMigrations().version(1, (v) =>
  v.createObjectStore({
    name: "users",
    primaryKey: "id",
  }),
)

At the top level, there’s a sequence of chained .version() calls. The return type of one version is carried into the method call of the next version.

Within versions, the same pattern is used: a version builder (the v param) is chained with method calls which carry granular type information through their return value: exactly what object stores have been added, their names, their configuration of primary keys, and more.

Under the hood, the API translates directly to native, imperative IDB calls (in this case, IDBDatabase.createObjectStore) without any changes in behavior. It’s just a different way of expressing the same thing, one that is more introspectable by the TypeScript compiler.
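As a rough illustration of how return types can carry store information down a chain, here is a heavily simplified sketch. It is not idb-builder’s actual implementation: the `VersionBuilder`, `Migrations`, `Operation`, and `StoresOf` names are hypothetical, and instead of calling IndexedDB it only records the operations that would later be replayed inside an upgradeneeded handler.

```typescript
// One recorded step, replayable later against a real IDBDatabase.
type Operation = { kind: "createObjectStore"; name: string; primaryKey: string }
type PlannedVersion = { version: number; operations: readonly Operation[] }

// `Stores` is a type-level map of store name to config; it never exists at runtime.
class VersionBuilder<Stores = {}> {
  readonly __stores?: Stores // phantom field, only here to carry the type
  readonly operations: readonly Operation[]

  constructor(operations: readonly Operation[] = []) {
    this.operations = operations
  }

  // Each call returns a new builder whose type includes the store just added.
  createObjectStore<Name extends string, Key extends string>(options: {
    name: Name
    primaryKey: Key
  }) {
    const op: Operation = {
      kind: "createObjectStore",
      name: options.name,
      primaryKey: options.primaryKey,
    }
    return new VersionBuilder<Stores & Record<Name, { primaryKey: Key }>>([
      ...this.operations,
      op,
    ])
  }
}

class Migrations<Stores = {}> {
  readonly __stores?: Stores // phantom field, as above
  readonly plan: readonly PlannedVersion[]

  constructor(plan: readonly PlannedVersion[] = []) {
    this.plan = plan
  }

  // The builder handed to `build` starts from the previous version's stores;
  // whatever type it returns becomes the input of the next `.version()` call.
  version<Next>(
    version: number,
    build: (v: VersionBuilder<Stores>) => VersionBuilder<Next>,
  ) {
    const { operations } = build(new VersionBuilder<Stores>())
    return new Migrations<Next>([...this.plan, { version, operations }])
  }
}

const migrations = new Migrations()
  .version(1, (v) => v.createObjectStore({ name: "users", primaryKey: "id" }))
  .version(2, (v) => v.createObjectStore({ name: "posts", primaryKey: "slug" }))

// Type-level readout: the literal "id" survived two levels of chaining.
type StoresOf<M> = M extends Migrations<infer S> ? S : never
const usersKey: StoresOf<typeof migrations>["users"]["primaryKey"] = "id"
```

Changing `"id"` on the last line to any other string is a compile-time error, while `migrations.plan` holds the plain runtime data needed to execute the same steps imperatively.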

Explicit vs. inferred generic parameters

The next piece of the puzzle is how to declare a schema for each object store at the time of its creation.

This is where we’d typically reach for an explicit generic type parameter:

createObjectStore<{ myField: string }>({ name: "myStore" })
// ❌ Cannot infer one generic while another is provided explicitly

The problem is that we’re already binding a generic type to infer and capture the literal types passed as parameters, such as object store name, primary key, etc.

In a perfect world we’d use partial inference to require that the first generic is provided by the caller while the second one is inferred from the parameters passed. Unfortunately, inference in TypeScript is all-or-nothing, so we have to either infer all type parameters or explicitly provide all of them. This either kills the DX by requiring the programmer to state the same information twice, or cripples the system’s ability to capture the information it needs about the database.
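The all-or-nothing behavior can be observed in a small sketch. The `createObjectStore` function and `StoreHandle` type below are hypothetical stand-ins, not the library’s API. The default on `Name` exists only so that the explicit call compiles at all; without it, TypeScript reports that two type arguments were expected.

```typescript
type StoreHandle<Schema, Name extends string> = { name: Name; rows: Schema[] }

// `Schema` is meant to be written by hand, `Name` to be inferred from the argument.
function createObjectStore<Schema, Name extends string = string>(options: {
  name: Name
}): StoreHandle<Schema, Name> {
  return { name: options.name, rows: [] }
}

// No explicit type arguments: `Name` is inferred as the literal "myStore",
// but `Schema` has nothing to infer from and becomes `unknown`.
const inferred = createObjectStore({ name: "myStore" })

// One explicit type argument: `Schema` is set, but `Name` is no longer
// inferred. It falls back to its default, the wide type `string`.
const explicit = createObjectStore<{ myField: string }>({ name: "myStore" })

// Compile-time checks that the literal survives in one case and is lost in the other.
type IsLiteral<S extends string> = string extends S ? false : true
const inferredKeepsLiteral: IsLiteral<typeof inferred.name> = true
const explicitKeepsLiteral: IsLiteral<typeof explicit.name> = false
```

Either way, one of the two pieces of information is lost, which is the trade-off described above.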

On the other hand, moving all these settings into the generic type parameter would make these values inaccessible at runtime, when we need to apply them while executing the database migrations. That means a unified approach like this also doesn’t work:

createObjectStore<{ name: "myStore"; schema: { myField: string } }>()
// ❌ Store name only exists at compile-time

Phantom type helper

We need to introduce a second function call as a kind of “slot” to insert explicit type information into.

createObjectStore({ name: "myStore", schema: type<{ myField: string }>() })
// ✅ Schema is captured by a phantom type helper

This type helper is a little bit of trickery: its type signature claims to return a value matching the generic type param, but in reality the function is a no-op and its return value is never used. It exists solely to pass type information to the createObjectStore call.

It’s not perfect, in the sense that it’s a small bit of cleverness with no runtime meaning, whose purpose a first-time user might not understand. But it’s still relatively readable at a glance.
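One way such a helper could be written is sketched below. This is an assumption about the general technique rather than the library’s real code: the body of `type`, and the simplified `createObjectStore` that returns a plain object, are both hypothetical.

```typescript
// Phantom type helper: claims to return a `T`, but actually returns undefined.
// Callers must never read the value; only its static type matters.
function type<T>(): T {
  return undefined as unknown as T
}

// Both generics are now inferred from the argument, so nothing has to be
// written twice. `Name` comes from a real runtime string, while `Schema`
// comes from the phantom value.
function createObjectStore<Name extends string, Schema>(options: {
  name: Name
  schema: Schema
}) {
  // Only `name` is read at runtime; `schema` is deliberately ignored.
  return { name: options.name, rows: [] as Schema[] }
}

const store = createObjectStore({
  name: "myStore",
  schema: type<{ myField: string }>(),
})

// The row type was captured: a row missing `myField` would not compile.
store.rows.push({ myField: "hello" })
```

The store name stays available at runtime for the actual migration, and the schema stays purely at the type level, which is the split the two earlier attempts could not achieve.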

Schema update deltas

The next challenge is how to handle modifying an object store in subsequent migrations. This isn’t trivial either: a schema change is essentially a function that takes a type as input and returns a modified type as a result. That can be modelled as a TypeScript generic:

type NewSchema<OldSchema> = OldSchema & { myNewField: string }

But again, it’s awkward to insert into a chained builder in a way that automatically binds the OldSchema based on where the migration is applied in the chain.

The solution is to provide just a type “delta” which is deeply merged into the store’s type at time of invocation. So it looks like:

v.createObjectStore({ name: "myStore", schema: type<{ myField: string }>() }).updateStoreSchema({
  name: "myStore",
  modifySchema: type<{ myNewField: string }>(),
})
// final store schema: `{ myField: string, myNewField: string }`

Fields can be removed by setting them to never.
The merging behavior happens recursively at each level of nesting.
If the final schema after updates is not backwards-compatible, we can require at compile-time that a backfill or transform function is provided.
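The three rules above can be approximated with a recursive mapped type. The sketch below is one possible encoding, not necessarily how the library does it: `MergeDelta`, `IsPlainObject`, and `Simplify` are hypothetical names, and the final assignability check is only a guess at what “not backwards-compatible” could mean.

```typescript
// Treat anything that is an object but not an array or function as mergeable.
type IsPlainObject<T> = T extends object
  ? T extends readonly unknown[] | ((...args: never[]) => unknown)
    ? false
    : true
  : false

// Flattens an intersection into a single object type for readability.
type Simplify<T> = { [K in keyof T]: T[K] }

type MergeDelta<Old, Delta> = Simplify<
  // Keys the delta does not mention are kept as they were.
  Omit<Old, keyof Delta> & {
    // Keys set to `never` are filtered out; the rest are added or merged.
    [K in keyof Delta as [Delta[K]] extends [never] ? never : K]: K extends keyof Old
      ? [IsPlainObject<Old[K]>, IsPlainObject<Delta[K]>] extends [true, true]
        ? MergeDelta<Old[K], Delta[K]> // recurse into nested objects
        : Delta[K] // otherwise the delta replaces the old type
      : Delta[K]
  }
>

type V1 = { myField: string; legacy: boolean; address: { city: string } }
type Delta = { myNewField: string; legacy: never; address: { zip: string } }
type V2 = MergeDelta<V1, Delta>

// `legacy` is gone, `myNewField` is required, and `address` was merged deeply.
type Expected = { myField: string; myNewField: string; address: { city: string; zip: string } }
const row: V2 = { myField: "a", myNewField: "b", address: { city: "Oslo", zip: "0150" } }
const same: Expected = row
const back: V2 = same

// Old rows are not assignable to the new schema, so a builder could use a
// check like this to demand a backfill function at compile time.
const needsBackfill: V1 extends V2 ? false : true = true
```

Because `V1` lacks the new required fields, the last line only compiles with `true`, which is the signal a builder could use to make the backfill argument mandatory.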

Git repo

Check out the full repo, which uses all the techniques introduced in this article:

nathanbabcock/idb-builder