Tenant Schema Migrations

Finsta uses schema-per-tenant multitenancy. Each tenant has its own PostgreSQL schema with a private flyway_schema_history table. A row in global.tenant (in the global schema) tracks each tenant’s schema_version and a coarse state.

Tenants get migrated through two complementary paths: a request filter that migrates on demand, and a background orchestrator that drains pending migrations slowly across the fleet.

Components

tritt.finsta.service.tenant.TenantMigrationFilter: HTTP server filter that runs on every request that carries a tenant header.

tritt.finsta.service.tenant.TenantMigrationOrchestrator: @Singleton, startup-triggered scheduled loop.

tritt.finsta.service.tenant.TenantMigrations: in-memory coordination table mapping tenantId to the in-flight TenantMigrationTuple; prevents duplicate work within a single JVM.

tritt.finsta.service.tenant.TenantServiceBean.migrateTenant: entry point that grabs the tenant row, runs Flyway, and releases the row.

tritt.finsta.domain.tenant.SchemaRepository.migrateTenantSchema: thin wrapper around Flyway for a single tenant schema.

Per-tenant state

The global.tenant row has two fields that together encode migration state:

Column            Meaning

schema_version    The Flyway version the tenant’s schema was last migrated to. Compared against SchemaRepository.getLatestVersion() to detect pending work.

state             null (or 'Pending' if never migrated) means available; 'Migrating' means an instance has claimed it; 'Failed' means the last attempt threw a FlywayException.

A tenant is "pending migration" when schema_version != latestVersion AND coalesce(state, 'Pending') = 'Pending'.
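The predicate above can be mirrored in application code, for example in a unit test. The following is a minimal sketch, not the project's implementation: the class name PendingCheck and the use of plain strings for versions and states are assumptions made for illustration. Note one difference from SQL: schema_version != latestVersion evaluates to NULL when schema_version is NULL, whereas this sketch treats a null schema_version as "differs from latest". How the real query handles a NULL schema_version is not stated here.

```java
import java.util.Objects;

/** Hypothetical helper mirroring the pending-migration predicate; not part of the codebase. */
final class PendingCheck {
    private PendingCheck() {}

    /**
     * Mirrors: schema_version != latestVersion AND coalesce(state, 'Pending') = 'Pending'.
     * A null state is treated as 'Pending', as coalesce does in the SQL form.
     * Assumption: a null schemaVersion counts as "not at latest".
     */
    static boolean isPending(String schemaVersion, String state, String latestVersion) {
        String effectiveState = (state == null) ? "Pending" : state;
        boolean behind = !Objects.equals(schemaVersion, latestVersion);
        return behind && "Pending".equals(effectiveState);
    }
}
```

With this sketch, a tenant in state 'Migrating' or 'Failed' is never pending, regardless of its schema_version.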

Two trigger paths

On-demand: TenantMigrationFilter

Fires on every HTTP request that carries a tenant header. Calls TenantMigrationOrchestrator.ensureMigratedTenant(tenantId), which delegates to TenantMigrations.get(…) for in-JVM deduplication and then to TenantServiceBean.migrateTenant. The filter blocks for up to finsta.tenant-migration.request-filter-timeout (default 10s) waiting for the migration to complete. On timeout it returns 503 via the TenantMigrationFailed problem code.

Frontends must handle 503 responses from this path — first-after-deploy requests for a given tenant on a given instance can hit this filter.

The synchronized (tenantId) block in the filter coalesces concurrent requests for the same tenant on the same instance — only one of them does the work, the rest wait on the same TenantMigrationTuple.
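The coalescing and timeout behaviour can be sketched as follows. This is an illustration under assumptions, not the real TenantMigrations or filter code: the class InFlightMigrations, the use of CompletableFuture in place of TenantMigrationTuple, and the mapping of a failed migration to 503 are all hypothetical. Only the timeout-to-503 mapping comes from the description above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical stand-in for the in-JVM coordination table. */
final class InFlightMigrations {
    private final ConcurrentMap<String, CompletableFuture<Void>> inFlight = new ConcurrentHashMap<>();

    /** Returns the shared future for a tenant; only the first caller starts the work. */
    CompletableFuture<Void> get(String tenantId, Consumer<String> migrate) {
        CompletableFuture<Void> fresh = new CompletableFuture<>();
        CompletableFuture<Void> existing = inFlight.putIfAbsent(tenantId, fresh);
        if (existing != null) {
            return existing; // another request already owns this migration
        }
        CompletableFuture.runAsync(() -> {
            try {
                migrate.accept(tenantId);
                fresh.complete(null);
            } catch (RuntimeException e) {
                fresh.completeExceptionally(e);
            } finally {
                inFlight.remove(tenantId, fresh); // allow a later retry
            }
        });
        return fresh;
    }

    /** Waits up to the timeout. 503 on timeout (per the text); 503 on failure is an assumption. */
    static int awaitStatus(CompletableFuture<Void> migration, long timeout, TimeUnit unit) {
        try {
            migration.get(timeout, unit);
            return 200;
        } catch (TimeoutException | ExecutionException e) {
            return 503;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 503;
        }
    }
}
```

A request that times out does not cancel the migration in this sketch; the work continues in the background, so a client retry after a 503 can succeed once it finishes.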

Background: TenantMigrationOrchestrator

Started on StartupEvent. Repeats:

  1. Pick a random pending tenant via TenantRepository.findNextTenantWithPendingMigrations (a single SQL query against global.tenant, no Flyway involvement).

  2. Run TenantServiceBean.migrateTenant.

  3. Wait finsta.tenant-migration.delay-between-base ± delay-between-offset (default 10s, no jitter) before picking the next.

  4. Stop when no more pending tenants.

The delay is intentional load-shaping, not a workaround for slowness — the orchestrator should drain the backlog without competing with on-demand filter-triggered migrations for connections, CPU, or PG load.
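The wait in step 3 can be sketched as below. The class name MigrationDelay, the uniform distribution of the jitter, and the clamp at zero are assumptions for illustration; the source only states that the offset is a ± random jitter on the base.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper computing the pause between scheduled migrations. */
final class MigrationDelay {
    private MigrationDelay() {}

    /**
     * Returns base plus a random offset in [-offset, +offset] (uniform, by assumption).
     * The offset is expected to be non-negative; the result is never negative.
     */
    static Duration next(Duration base, Duration offset) {
        long offsetMs = offset.toMillis();
        long jitterMs = (offsetMs == 0)
                ? 0
                : ThreadLocalRandom.current().nextLong(-offsetMs, offsetMs + 1);
        return Duration.ofMillis(Math.max(0, base.toMillis() + jitterMs));
    }
}
```

With the defaults (base 10s, offset 0s) this always yields exactly 10s. A non-zero offset spreads instances out so they do not all query global.tenant on the same tick.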

Cross-instance coordination

Multiple finsta instances run the orchestrator and serve filter-path migrations concurrently. There is no central coordinator; coordination happens through three layered mechanisms:

  1. Application-level row claim: TenantServiceBean.grabTenant UPDATEs state = 'Migrating' on the global.tenant row. The row has a Hibernate @Version column, so two instances trying to grab the same tenant contend on an optimistic lock. One wins; the other gets an OptimisticLockException and skips the tenant (grabTenant returns Optional.empty(), and the orchestrator picks a different random tenant on the next tick).

  2. PostgreSQL advisory lock per schema: Flyway acquires pg_advisory_lock keyed on the qualified <schema>.flyway_schema_history table. Different tenant schemas → different lock keys → no cross-tenant blocking. Two instances trying to migrate the same schema simultaneously serialize on this lock as a second line of defense.

  3. In-JVM dedup: TenantMigrations keeps a map of in-flight migrations so concurrent callers in the same JVM share the result.

TenantServiceBean.releaseTenant clears the Migrating state and writes the new schema_version.
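The claim and release cycle can be modelled in memory to show why only one claimant wins. This is a sketch under assumptions, not the Hibernate-backed code: TenantRow is a hypothetical class, a synchronized version comparison stands in for the @Version check performed by the UPDATE, and clearing the state to null on release is assumed (the source says only that the Migrating state is cleared).

```java
import java.util.Optional;

/** Hypothetical in-memory model of a global.tenant row with an optimistic-lock version. */
final class TenantRow {
    private String state;          // null means available
    private long version;          // stands in for the Hibernate @Version column
    private String schemaVersion;

    TenantRow(String schemaVersion) {
        this.schemaVersion = schemaVersion;
    }

    synchronized long version() { return version; }
    synchronized String state() { return state; }
    synchronized String schemaVersion() { return schemaVersion; }

    /**
     * Claims the row only if the version the caller read is still current and the
     * row is available. Empty means another claimant won the race.
     */
    synchronized Optional<Long> grab(long seenVersion) {
        boolean available = state == null || "Pending".equals(state);
        if (seenVersion != version || !available) {
            return Optional.empty();
        }
        state = "Migrating";
        version++;
        return Optional.of(version);
    }

    /** Clears the claim and records the new schema version. */
    synchronized void release(String newSchemaVersion) {
        state = null;
        schemaVersion = newSchemaVersion;
        version++;
    }
}
```

If two instances both read version 0 and both call grab(0), the first call bumps the version to 1 and the second sees a stale version and gets an empty result, which corresponds to the OptimisticLockException path.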

Flyway lock mode (transactional vs session)

SchemaRepository.tenantFluentConfiguration sets flyway.postgresql.transactional.lock=false. This switches Flyway’s coordination lock from a transaction-scoped pg_advisory_xact_lock to a session-scoped pg_advisory_lock.

The mutual-exclusion guarantee is identical. The reason for the override is that the default holds the lock connection in idle in transaction for the entire migration duration, pinning the cluster-wide xmin horizon and stalling autovacuum on shared catalog tables (pg_class, pg_attribute) and on the global tenant row. At scale (thousands of tenant schemas + concurrent migrations across instances) the resulting bloat materially slows every subsequent migration.

The trade is that a hung-but-alive JVM no longer gets its lock auto-released by idle_in_transaction_session_timeout — see the runbook entry for diagnosis and recovery.

Performance at scale

Known bottleneck areas, in rough order of impact at high tenant count:

  1. PostgreSQL catalog cache invalidation (sinval): every DDL statement issued by any backend on the cluster broadcasts an invalidation message to all other backends, which then evict and reload affected catalog cache entries. With thousands of tenant schemas (each contributing many pg_class, pg_attribute, pg_index rows) and concurrent migrations across instances, sinval pressure scales with both schema count and instance count. This presents as per-tenant migration time growing as instances are added; adding instances does not linearly increase total migration throughput.

  2. information_schema queries: SchemaRepository.discoverTenantSchemas and schemaExists use information_schema.schemata, an unindexed view over pg_namespace. Filtering it scans a catalog whose size grows with total schema count. Replace with pg_catalog.pg_namespace queries when the cost shows up in profiles.

  3. Per-tenant Flyway init in the request-filter path: ensureMigratedTenant calls into flyway.info() on every fresh tenant request per JVM lifetime. The orchestrator’s pending-tenant query already avoids Flyway; the filter path does not. A small in-JVM cache of "last known up-to-date version per tenant" would eliminate redundant Flyway init for the common case (tenant already at latest).
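The cache suggested in item 3 could look roughly like the following. This is a sketch of one possible shape, not existing code: the class UpToDateCache and its method names are hypothetical, and versions are modelled as plain strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-JVM cache of the last version each tenant is known to be migrated to. */
final class UpToDateCache {
    private final ConcurrentMap<String, String> lastKnown = new ConcurrentHashMap<>();

    /** True only if this JVM has already seen the tenant at exactly the latest version. */
    boolean isKnownUpToDate(String tenantId, String latestVersion) {
        return latestVersion.equals(lastKnown.get(tenantId));
    }

    /** Records the version after a successful migration or a successful Flyway check. */
    void markMigrated(String tenantId, String version) {
        lastKnown.put(tenantId, version);
    }
}
```

The filter would consult isKnownUpToDate before touching Flyway and fall through to the existing path on a miss. Because the comparison is against the current latest version, a deploy that ships new migrations invalidates every entry without explicit eviction.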

See issue 1222 and the investigation plan at plans/2026-05-07_1222__flyway-transactional-lock-multitenant-investigation.md for the open performance work.

Configuration

finsta:
  tenant-migration:
    enable-scheduled: true              # turn the background orchestrator on/off
    delay-between-base: 10s             # base wait between scheduled migrations
    delay-between-offset: 0s            # ± random jitter on the base
    migration-timeout: 2m               # bound for waiting on a single migration
    request-filter-timeout: 10s         # filter-path 503 cutoff