Database

PostgreSQL database connection and pool configuration.

Collation guidance:

Recommended (multi-language prod): ICU collation “und-x-icu” for consistent

ordering across platforms. Use TEMPLATE template0 to guarantee your collation/ctype apply (template1 may carry different locale or extensions).

CREATE DATABASE "identity_scribe"
  TEMPLATE template0
  ENCODING = 'UTF8'
  LC_COLLATE = 'und-x-icu'
  LC_CTYPE   = 'und-x-icu'
  CONNECTION LIMIT = -1;

Performance-focused / ASCII-only prod: choose “C” for maximum speed; keep

it consistent across all environments.

CREATE DATABASE "identity_scribe"
  TEMPLATE template0
  ENCODING = 'UTF8'
  LC_COLLATE = 'C'
  LC_CTYPE   = 'C'
  CONNECTION LIMIT = -1;

Connection pool architecture

Identity Scribe uses three separate connection pools to isolate different workloads:

Batch pool - Used by: Transcription tasks (main workload) - Size: Configured via maxPoolSize (default: concurrency + 5, max: concurrency * 1.5) - Each transcription task uses one connection for its lifetime
System pool - Used by: System-level operations (commit coordination, migrations, maintenance, entry preparation) - Size: Configurable via systemPoolSize (optional) - Default: max(transcribeCount + 4, concurrency / 4), clamped between 2 and maxPoolSize / 2 - Scales with both number of transcribes and concurrency level - Isolates system operations from transcription tasks
Channel pool - Used by: LDAP channel services (IdentityHub, LDAP channels) - Size: Configurable via channelPoolSize (optional) - Default: concurrency, minimum 2 - Can experience heavy LDAP traffic with multiple channels - Isolates channel operations from batch and commit workloads

Total maximum connections (when all pools are fully utilized): = Batch Pool + System Pool + Channel Pool = maxPoolSize + systemPoolSize + channelPoolSize

Example with default settings (concurrency = 16, 2 transcribes):

Batch Pool: 16 + 5 = 21 connections (default maxPoolSize)
System Pool: max(2 + 4, 16/4) = max(6, 4) = 6 connections
Channel Pool: 16 connections
Total Maximum: 21 + 6 + 16 = 43 connections

Example with maxPoolSize explicitly set to 50 (concurrency = 16, 2 transcribes):

Batch Pool: 50 connections
System Pool: max(2 + 4, 16/4) = 6 connections (capped at 50/2 = 25)
Channel Pool: 16 connections
Total Maximum: 50 + 6 + 16 = 72 connections

Tuning guidelines:

Monitor pool metrics: Track hikari.connections.active, hikari.connections.idle,

and hikari.connections.pending for each pool

Signs of under-sizing: If pending connections are consistently high or active equals

maximum for extended periods, consider increasing pool size

Signs of over-sizing: If idle connections consistently exceed 50% of maximum,

consider reducing pool size to free database resources

High concurrency workloads: Many more tasks can be queued than

concurrency, but each task still uses one connection. Monitor actual connection utilization rather than task count.

database.channel-pool-size

Size of the channel connection pool. This pool handles LDAP channel operations (IdentityHub, LDAP channels).

Default: concurrency

Minimum: 2

Channels can experience heavy LDAP traffic, especially with multiple active channels. Increase for high-throughput read-heavy workloads.

Example: With concurrency = 16:

Default = 16 connections

Priority: SCRIBE_DATABASE_CHANNEL_POOL_SIZE > config

Property	Value
Default	`16`
Override	`SCRIBE_DATABASE_CHANNEL_POOL_SIZE` (optional)

database.channel-pool-size = ${?SCRIBE_DATABASE_CHANNEL_POOL_SIZE}

database.connection-hints

Connection Hints (channel query defaults)

Session-level hints applied to channel queries (REST, LDAP, GraphQL, gRPC). All keys are unset by default (PostgreSQL defaults apply). Channels can override these defaults in their own connection-hints section. These hints control PostgreSQL session parameters for query execution. They are applied when opening a connection and restored when the query completes.

database.connection-hints.lock-timeout

Lock acquisition timeout.

PostgreSQL will abort any statement that waits longer than this for a lock. Format: HOCON duration (e.g., 5s, 30s, 1m)

Priority: SCRIBE_DATABASE_LOCK_TIMEOUT > config

Property	Value
Default	`null`
Override	`SCRIBE_DATABASE_LOCK_TIMEOUT` (optional)

database.connection-hints.lock-timeout = ${?SCRIBE_DATABASE_LOCK_TIMEOUT}

database.connection-hints.session-flags

Session flags (comma-separated).

Supported flags:

force-custom-plan: Forces PostgreSQL to use custom plans for prepared statements
jit-off: Disables JIT compilation

Priority: SCRIBE_DATABASE_SESSION_FLAGS > config

Property	Value
Default	`null`
Override	`SCRIBE_DATABASE_SESSION_FLAGS` (optional)

database.connection-hints.session-flags = ${?SCRIBE_DATABASE_SESSION_FLAGS}

database.connection-hints.statement-timeout

Statement execution timeout (prevents runaway queries). PostgreSQL will abort any statement that takes longer than this duration. Format: HOCON duration (e.g., 30s, 1m, 500ms)

Priority: SCRIBE_DATABASE_STATEMENT_TIMEOUT > config

Property	Value
Default	`null`
Override	`SCRIBE_DATABASE_STATEMENT_TIMEOUT` (optional)

database.connection-hints.statement-timeout = ${?SCRIBE_DATABASE_STATEMENT_TIMEOUT}

database.connection-hints.work-mem

PostgreSQL work_mem per-query memory allocation. Controls the amount of memory used for internal sort operations and hash tables. Format: PostgreSQL memory size (e.g., “64MB”, “256MB”, “1GB”) Valid units: B, kB, MB, GB, TB (case-insensitive)

Priority: SCRIBE_DATABASE_WORK_MEM > config

Property	Value
Default	`null`
Override	`SCRIBE_DATABASE_WORK_MEM` (optional)

database.connection-hints.work-mem = ${?SCRIBE_DATABASE_WORK_MEM}

database.maintenance

Maintenance scheduling for tasks like vacuuming, re-indexing, etc.

database.maintenance.cron

Define a cron expression (unix cron format) to run the maintenance task at a specific time Default not set, eg no cron window

Priority: SCRIBE_DATABASE_MAINTENANCE_CRON > config

cron = ‘0 5 * * *’ # 5am daily

Property	Value
Override	`SCRIBE_DATABASE_MAINTENANCE_CRON` (optional)

database.maintenance.cron = ${?SCRIBE_DATABASE_MAINTENANCE_CRON}

database.maintenance.enabled

If enabled, maintenance tasks will be scheduled Defaults to enabled unless readonly is set to true

Priority: SCRIBE_DATABASE_MAINTENANCE_ENABLED > config

Property	Value
Default	`true`
Override	`SCRIBE_DATABASE_MAINTENANCE_ENABLED` (optional)

database.maintenance.enabled = ${?SCRIBE_DATABASE_MAINTENANCE_ENABLED}

database.maintenance.hard-thresholds

Used after interval, to check if system is idle enough to run maintenance

Priority: SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS > config

Defaults to:

hard-thresholds = {
scribe_db_connections_active = Math.ceil(Math.pow(Math.log(max-pool-size + channel-pool-size + system-pool-size), 2)) # DB pool load, eg many connections are active
}

Property	Value
Override	`SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS` (optional)

database.maintenance.hard-thresholds = ${?SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS}

database.maintenance.interval

Runs based on system load metrics, at least every 36 hours Only a well-known set of threshold metrics is supported (no labels/selectors):

scribe_ingest_tasks_active
scribe_ingest_task_pressure
scribe_ingest_queue_pressure
scribe_db_connections_active
scribe_ldap_connections_active
scribe_query_permit_pressure

Arbitrary metric names are NOT supported. These values are queried directly from ScribeMetrics (not Prometheus). The default value for interval is 36 hours, if no interval or cron is set

Priority: SCRIBE_DATABASE_MAINTENANCE_INTERVAL > config

Property	Value
Default	`36 hours`
Override	`SCRIBE_DATABASE_MAINTENANCE_INTERVAL` (optional)

database.maintenance.interval = ${?SCRIBE_DATABASE_MAINTENANCE_INTERVAL}

database.maintenance.soft-thresholds

Used after half of interval, to check if system is idle enough to run maintenance

Priority: SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS > config

Defaults to:

soft-thresholds = {
scribe_db_connections_active = Math.ceil(Math.log(max-pool-size + channel-pool-size + system-pool-size)) # DB pool load, eg not many connections are active
}

Property	Value
Override	`SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS` (optional)

database.maintenance.soft-thresholds = ${?SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS}

database.max-pool-size

Maximum size of the batch connection pool. This pool is used by transcription tasks - each task uses one connection for its lifetime.

Default: concurrency + 5

Minimum: 5

Maximum: concurrency * 1.5 (rounded up) When setting this value, remember that the total connection count includes:

This batch pool (max-pool-size)
System pool (configurable via system-pool-size, default: max(transcribeCount + 4, concurrency / 4))
Channel pool (configurable via channel-pool-size, default: concurrency)

Example: With max-pool-size = 50, concurrency = 16, and 2 transcribes:

Total max connections = 50 (batch) + 6 (system) + 16 (channel) = 72 connections

Priority: SCRIBE_DATABASE_MAX_POOL_SIZE > config

Property	Value
Default	`21`
Override	`SCRIBE_DATABASE_MAX_POOL_SIZE` (optional)

database.max-pool-size = ${?SCRIBE_DATABASE_MAX_POOL_SIZE}

database.password

Priority: SCRIBE_DATABASE_PASSWORD > config

Property	Value
Default	`""`
Override	`SCRIBE_DATABASE_PASSWORD` (optional)

database.password = ${?SCRIBE_DATABASE_PASSWORD}

database.prepare-threshold

Prepared Statement Caching (advanced tuning)

Controls server-side prepared statements and per-connection caching. These improve performance by reusing query plans across executions. Memory footprint: poolSize * prepared-statement-cache-size

Example: 35 connections * 8 MiB = ~280 MiB

prepare-threshold: executions before server-side prepare (1 = immediate) prepared-statement-cache-queries: max cached statements per connection prepared-statement-cache-size: memory limit per connection (HOCON memory size syntax)

Property	Value
Default	`1`

database.prepare-threshold = 1

database.prepared-statement-cache-queries

Property	Value
Default	`256`

database.prepared-statement-cache-queries = 256

database.prepared-statement-cache-size

Property	Value
Default	`8 MiB`

database.prepared-statement-cache-size = 8 MiB

database.query-http-acquisition-timeout

Query Connection Limiter

Timeout for HTTP/GraphQL queries when the system is busy. When all connections are in use, HTTP and GraphQL queries wait up to this duration before returning 503 Service Unavailable with a Retry-After header. LDAP queries block up to their query time limit instead.

Default: 5 seconds

Priority: SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT > config

Property	Value
Default	`5s`
Override	`SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT` (optional)

database.query-http-acquisition-timeout = ${?SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT}

database.retry

Retry policy for acquiring database connections. Unset fields inherit from the root retry block.

database.retry.initial-delay

Initial delay between retries while waiting for database connections

Priority: SCRIBE_DATABASE_RETRY_INITIAL_DELAY > config

Property	Value
Default	`100 milliseconds`
Override	`SCRIBE_DATABASE_RETRY_INITIAL_DELAY` (optional)

database.retry.initial-delay = ${?SCRIBE_DATABASE_RETRY_INITIAL_DELAY}

database.retry.jitter

Randomized jitter added to each delay to avoid thundering herds

Priority: SCRIBE_DATABASE_RETRY_JITTER > config

Property	Value
Default	`25 milliseconds`
Override	`SCRIBE_DATABASE_RETRY_JITTER` (optional)

database.retry.jitter = ${?SCRIBE_DATABASE_RETRY_JITTER}

database.retry.max-attempts

Set to >0 to cap retries by attempt count instead of duration (0 = unlimited)

Priority: SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS > config

Property	Value
Default	`0`
Override	`SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS` (optional)

database.retry.max-attempts = ${?SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS}

database.retry.max-delay

Maximum delay between retries

Priority: SCRIBE_DATABASE_RETRY_MAX_DELAY > config

Property	Value
Default	`5 seconds`
Override	`SCRIBE_DATABASE_RETRY_MAX_DELAY` (optional)

database.retry.max-delay = ${?SCRIBE_DATABASE_RETRY_MAX_DELAY}

database.retry.max-duration

Maximum time spent retrying before surfacing a timeout (leave unset to retry indefinitely)

Priority: SCRIBE_DATABASE_RETRY_MAX_DURATION > config

Property	Value
Default	`null`
Override	`SCRIBE_DATABASE_RETRY_MAX_DURATION` (optional)

database.retry.max-duration = ${?SCRIBE_DATABASE_RETRY_MAX_DURATION}

database.ssl

defaults to the root ssl configuration

database.ssl.ca

The location of the root certificate for authenticating the server. File containing the root certificate when validating server (mode = “verify-ca” or “verify-full”). Default will be the file “root.crt” in “$HOME/.postgresql” (*nix) or “%APPDATA%\postgresql” (windows).

Priority: SCRIBE_DATABASE_SSL_CA > config > ssl.ca

Property	Value
Default	`"relative/from/config/file/ca.pem"`
Override	`SCRIBE_DATABASE_SSL_CA` (optional)

database.ssl.ca = ${?SCRIBE_DATABASE_SSL_CA}

database.ssl.cert

The location of the client’s SSL certificate File containing the SSL Certificate. Default will be the file “postgresql.crt” in “$HOME/.postgresql” (*nix) or “%APPDATA%\postgresql” (windows).

Priority: SCRIBE_DATABASE_SSL_CERT > config > ssl.cert

Property	Value
Default	`"relative/from/config/file/cert.pem"`
Override	`SCRIBE_DATABASE_SSL_CERT` (optional)

database.ssl.cert = ${?SCRIBE_DATABASE_SSL_CERT}

database.ssl.enabled

Must be set to ‘true’ to enable SSL for the database

Priority: SCRIBE_DATABASE_SSL_ENABLED > config

Property	Value
Default	`false`
Override	`SCRIBE_DATABASE_SSL_ENABLED` (optional)

database.ssl.enabled = ${?SCRIBE_DATABASE_SSL_ENABLED}

database.ssl.key

The location of the client’s PKCS#8 SSL key. File containing the SSL Key. Default will be the file “postgresql.pk8” in “$HOME/.postgresql” (*nix) or “%APPDATA%\postgresql” (windows).

Priority: SCRIBE_DATABASE_SSL_KEY > config > ssl.key

Property	Value
Default	`"relative/from/config/file/cert.pem"`
Override	`SCRIBE_DATABASE_SSL_KEY` (optional)

database.ssl.key = ${?SCRIBE_DATABASE_SSL_KEY}

database.ssl.password

The password for the client’s ssl key.

Priority: SCRIBE_DATABASE_SSL_PASSWORD > config > ssl.password

Property	Value
Default	`"..."`
Override	`SCRIBE_DATABASE_SSL_PASSWORD` (optional)

database.ssl.password = ${?SCRIBE_DATABASE_SSL_PASSWORD}

database.system-pool-size

Size of the system connection pool. This pool handles commit coordination, migrations, maintenance, and entry preparation.

Default: max(transcribeCount + 4, concurrency / 4)

Minimum: 2

Maximum: max-pool-size / 2 The default scales with both the number of configured transcribes and concurrency level to handle workloads where multiple transcribes perform system operations simultaneously.

Example: With 4 transcribes and concurrency = 16:

Default = max(4 + 4, 16/4) = max(8, 4) = 8 connections

Priority: SCRIBE_DATABASE_SYSTEM_POOL_SIZE > config

Property	Value
Default	`6`
Override	`SCRIBE_DATABASE_SYSTEM_POOL_SIZE` (optional)

database.system-pool-size = ${?SCRIBE_DATABASE_SYSTEM_POOL_SIZE}

database.url

https://www.prisma.io/dataguide/postgresql/short-guides/connection-uris https://jdbc.postgresql.org/documentation/use/#connecting-to-the-database Multiple servers can be specified, separated by a comma, eg “postgres://server1,server2” Options can be set in the URL or as separate properties

Priority: SCRIBE_DATABASE_URL > config

url = “postgres:”${user.name}

Property	Value
Override	`SCRIBE_DATABASE_URL` (optional)

database.url = ${?SCRIBE_DATABASE_URL}

database.user

Priority: SCRIBE_DATABASE_USER > config

user = ${user.name}

Property	Value
Override	`SCRIBE_DATABASE_USER` (optional)

database.user = ${?SCRIBE_DATABASE_USER}