Database
PostgreSQL database connection and pool configuration.
Collation guidance:

- Recommended (multi-language prod): ICU collation "und-x-icu" for consistent ordering across platforms. Use TEMPLATE template0 to guarantee your collation/ctype apply (template1 may carry a different locale or extensions).

```sql
CREATE DATABASE "identity_scribe"
  TEMPLATE template0
  ENCODING = 'UTF8'
  LC_COLLATE = 'und-x-icu'
  LC_CTYPE = 'und-x-icu'
  CONNECTION LIMIT = -1;
```

- Performance-focused / ASCII-only prod: choose "C" for maximum speed; keep it consistent across all environments.

```sql
CREATE DATABASE "identity_scribe"
  TEMPLATE template0
  ENCODING = 'UTF8'
  LC_COLLATE = 'C'
  LC_CTYPE = 'C'
  CONNECTION LIMIT = -1;
```

Connection pool architecture
Identity Scribe uses three separate connection pools to isolate different workloads:
- Batch pool
  - Used by: transcription tasks (the main workload)
  - Size: configured via `maxPoolSize` (default: `concurrency + 5`, max: `concurrency * 1.5`)
  - Each transcription task uses one connection for its lifetime
- System pool
  - Used by: system-level operations (commit coordination, migrations, maintenance, entry preparation)
  - Size: configurable via `systemPoolSize` (optional)
  - Default: `max(transcribeCount + 4, concurrency / 4)`, clamped between 2 and `maxPoolSize / 2`
  - Scales with both the number of transcribes and the concurrency level
  - Isolates system operations from transcription tasks
- Channel pool
  - Used by: LDAP channel services (IdentityHub, LDAP channels)
  - Size: configurable via `channelPoolSize` (optional)
  - Default: `concurrency`, minimum 2
  - Can experience heavy LDAP traffic with multiple channels
  - Isolates channel operations from batch and commit workloads
Total maximum connections (when all pools are fully utilized) = batch pool + system pool + channel pool = `maxPoolSize + systemPoolSize + channelPoolSize`
Example with default settings (concurrency = 16, 2 transcribes):
- Batch Pool: 16 + 5 = 21 connections (default maxPoolSize)
- System Pool: max(2 + 4, 16/4) = max(6, 4) = 6 connections
- Channel Pool: 16 connections
- Total Maximum: 21 + 6 + 16 = 43 connections
Example with maxPoolSize explicitly set to 50 (concurrency = 16, 2 transcribes):
- Batch Pool: 50 connections
- System Pool: max(2 + 4, 16/4) = 6 connections (capped at 50/2 = 25)
- Channel Pool: 16 connections
- Total Maximum: 50 + 6 + 16 = 72 connections
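The second example can be pinned explicitly in config. A sketch using the keys documented below (`database.max-pool-size`, `database.system-pool-size`, `database.channel-pool-size`); the values match the example above:

```hocon
# maxPoolSize = 50, concurrency = 16, 2 transcribes
# Total maximum connections: 50 + 6 + 16 = 72
database {
  max-pool-size = 50      # batch pool (transcription tasks)
  system-pool-size = 6    # commit coordination, migrations, maintenance
  channel-pool-size = 16  # LDAP channel services
}
```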
Tuning guidelines:
- Monitor pool metrics: track `hikari.connections.active`, `hikari.connections.idle`, and `hikari.connections.pending` for each pool
- Signs of under-sizing: if `pending` connections are consistently high, or `active` equals `maximum` for extended periods, consider increasing the pool size
- Signs of over-sizing: if `idle` connections consistently exceed 50% of `maximum`, consider reducing the pool size to free database resources
- High concurrency workloads: many more tasks can be queued than `concurrency`, but each task still uses one connection. Monitor actual connection utilization rather than task count.
database.channel-pool-size
Size of the channel connection pool. This pool handles LDAP channel operations (IdentityHub, LDAP channels).
Default: concurrency
Minimum: 2
Channels can experience heavy LDAP traffic, especially with multiple active channels. Increase for high-throughput read-heavy workloads.
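For high-throughput deployments, the pool can be set explicitly rather than relying on the `concurrency` default. A sketch (the value 32 is illustrative, not a recommendation):

```hocon
# Double the default for a read-heavy LDAP workload (concurrency = 16)
database.channel-pool-size = 32
```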
Example: With concurrency = 16:
Default = 16 connections

Priority: SCRIBE_DATABASE_CHANNEL_POOL_SIZE > config
| Property | Value |
|---|---|
| Default | 16 |
| Override | SCRIBE_DATABASE_CHANNEL_POOL_SIZE (optional) |
```hocon
database.channel-pool-size = ${?SCRIBE_DATABASE_CHANNEL_POOL_SIZE}
```

database.connection-hints

Connection Hints (channel query defaults)
Session-level hints applied to channel queries (REST, LDAP, GraphQL, gRPC). All keys are unset by default (PostgreSQL defaults apply). Channels can override these defaults in their own connection-hints section. These hints control PostgreSQL session parameters for query execution. They are applied when opening a connection and restored when the query completes.
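All four hint keys (documented in the subsections below) can be set together in one block. A sketch; the values are illustrative, not recommendations:

```hocon
database.connection-hints {
  lock-timeout = 5s                # abort statements waiting >5s for a lock
  statement-timeout = 30s          # abort statements running >30s
  work-mem = "64MB"                # per-query sort/hash memory
  session-flags = "force-custom-plan,jit-off"
}
```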
database.connection-hints.lock-timeout
Lock acquisition timeout.
PostgreSQL will abort any statement that waits longer than this for a lock. Format: HOCON duration (e.g., 5s, 30s, 1m)
Priority: SCRIBE_DATABASE_LOCK_TIMEOUT > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_DATABASE_LOCK_TIMEOUT (optional) |
```hocon
database.connection-hints.lock-timeout = ${?SCRIBE_DATABASE_LOCK_TIMEOUT}
```

database.connection-hints.session-flags

Session flags (comma-separated).
Supported flags:
- force-custom-plan: Forces PostgreSQL to use custom plans for prepared statements
- jit-off: Disables JIT compilation
Priority: SCRIBE_DATABASE_SESSION_FLAGS > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_DATABASE_SESSION_FLAGS (optional) |
```hocon
database.connection-hints.session-flags = ${?SCRIBE_DATABASE_SESSION_FLAGS}
```

database.connection-hints.statement-timeout

Statement execution timeout (prevents runaway queries). PostgreSQL will abort any statement that takes longer than this duration. Format: HOCON duration (e.g., 30s, 1m, 500ms)
Priority: SCRIBE_DATABASE_STATEMENT_TIMEOUT > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_DATABASE_STATEMENT_TIMEOUT (optional) |
```hocon
database.connection-hints.statement-timeout = ${?SCRIBE_DATABASE_STATEMENT_TIMEOUT}
```

database.connection-hints.work-mem

PostgreSQL work_mem per-query memory allocation. Controls the amount of memory used for internal sort operations and hash tables. Format: PostgreSQL memory size (e.g., "64MB", "256MB", "1GB"). Valid units: B, kB, MB, GB, TB (case-insensitive)
Priority: SCRIBE_DATABASE_WORK_MEM > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_DATABASE_WORK_MEM (optional) |
```hocon
database.connection-hints.work-mem = ${?SCRIBE_DATABASE_WORK_MEM}
```

database.maintenance

Maintenance scheduling for tasks like vacuuming, re-indexing, etc.
database.maintenance.cron
Define a cron expression (Unix cron format) to run the maintenance task at a specific time. Default: not set, i.e. no cron window.
Priority: SCRIBE_DATABASE_MAINTENANCE_CRON > config
```hocon
cron = "0 5 * * *" # 5am daily
```
| Property | Value |
|---|---|
| Override | SCRIBE_DATABASE_MAINTENANCE_CRON (optional) |
```hocon
database.maintenance.cron = ${?SCRIBE_DATABASE_MAINTENANCE_CRON}
```

database.maintenance.enabled

If enabled, maintenance tasks will be scheduled. Defaults to enabled unless readonly is set to true.
Priority: SCRIBE_DATABASE_MAINTENANCE_ENABLED > config
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_DATABASE_MAINTENANCE_ENABLED (optional) |
```hocon
database.maintenance.enabled = ${?SCRIBE_DATABASE_MAINTENANCE_ENABLED}
```

database.maintenance.hard-thresholds

Used after the full interval has elapsed, to check whether the system is idle enough to run maintenance.
Priority: SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS > config
Defaults to:
```hocon
hard-thresholds = {
  # DB pool load, i.e. many connections are active
  scribe_db_connections_active = Math.ceil(Math.pow(Math.log(max-pool-size + channel-pool-size + system-pool-size), 2))
}
```

| Property | Value |
|---|---|
| Override | SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS (optional) |
```hocon
database.maintenance.hard-thresholds = ${?SCRIBE_DATABASE_MAINTENANCE_HARD_THRESHOLDS}
```

database.maintenance.interval

Runs based on system load metrics, at least every 36 hours. Only a well-known set of threshold metrics is supported (no labels/selectors):
- scribe_ingest_tasks_active
- scribe_ingest_task_pressure
- scribe_ingest_queue_pressure
- scribe_db_connections_active
- scribe_ldap_connections_active
- scribe_query_permit_pressure
Arbitrary metric names are NOT supported. These values are queried directly from ScribeMetrics (not Prometheus). The default interval is 36 hours if neither interval nor cron is set.
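As a sketch of how interval and thresholds might be combined (assuming plain numeric thresholds are accepted alongside the expression form shown in the defaults; the metric names are from the supported list above, the values are illustrative):

```hocon
database.maintenance {
  interval = 24 hours
  # After half the interval (12h), run only if the system is very quiet
  soft-thresholds {
    scribe_ingest_tasks_active = 0
    scribe_db_connections_active = 4
  }
  # After the full 24h, run as long as DB load is merely moderate
  hard-thresholds {
    scribe_db_connections_active = 16
  }
}
```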
Priority: SCRIBE_DATABASE_MAINTENANCE_INTERVAL > config
| Property | Value |
|---|---|
| Default | 36 hours |
| Override | SCRIBE_DATABASE_MAINTENANCE_INTERVAL (optional) |
```hocon
database.maintenance.interval = ${?SCRIBE_DATABASE_MAINTENANCE_INTERVAL}
```

database.maintenance.soft-thresholds

Used after half of the interval has elapsed, to check whether the system is idle enough to run maintenance.
Priority: SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS > config
Defaults to:
```hocon
soft-thresholds = {
  # DB pool load, i.e. not many connections are active
  scribe_db_connections_active = Math.ceil(Math.log(max-pool-size + channel-pool-size + system-pool-size))
}
```

| Property | Value |
|---|---|
| Override | SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS (optional) |
```hocon
database.maintenance.soft-thresholds = ${?SCRIBE_DATABASE_MAINTENANCE_SOFT_THRESHOLDS}
```

database.max-pool-size

Maximum size of the batch connection pool. This pool is used by transcription tasks - each task uses one connection for its lifetime.
Default: concurrency + 5
Minimum: 5
Maximum: concurrency * 1.5 (rounded up)

When setting this value, remember that the total connection count includes:
- This batch pool (`max-pool-size`)
- System pool (configurable via `system-pool-size`, default: `max(transcribeCount + 4, concurrency / 4)`)
- Channel pool (configurable via `channel-pool-size`, default: `concurrency`)
Example: With max-pool-size = 50, concurrency = 16, and 2 transcribes:
Total max connections = 50 (batch) + 6 (system) + 16 (channel) = 72 connections
Priority: SCRIBE_DATABASE_MAX_POOL_SIZE > config
| Property | Value |
|---|---|
| Default | 21 |
| Override | SCRIBE_DATABASE_MAX_POOL_SIZE (optional) |
```hocon
database.max-pool-size = ${?SCRIBE_DATABASE_MAX_POOL_SIZE}
```

database.password

Priority: SCRIBE_DATABASE_PASSWORD > config
| Property | Value |
|---|---|
| Default | "" |
| Override | SCRIBE_DATABASE_PASSWORD (optional) |
```hocon
database.password = ${?SCRIBE_DATABASE_PASSWORD}
```

database.prepare-threshold

Prepared Statement Caching (advanced tuning)
Controls server-side prepared statements and per-connection caching. These improve performance by reusing query plans across executions. Memory footprint: poolSize * prepared-statement-cache-size
Example: 35 connections * 8 MiB = ~280 MiB
- `prepare-threshold`: executions before server-side prepare (1 = immediate)
- `prepared-statement-cache-queries`: max cached statements per connection
- `prepared-statement-cache-size`: memory limit per connection (HOCON memory size syntax)
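Restated as one explicit block (these are the documented defaults, so this sketch changes nothing; it is here to show the shape):

```hocon
# Memory footprint is roughly poolSize * prepared-statement-cache-size,
# e.g. 35 connections * 8 MiB = ~280 MiB
database {
  prepare-threshold = 1                  # server-side prepare on first execution
  prepared-statement-cache-queries = 256 # max cached statements per connection
  prepared-statement-cache-size = 8 MiB  # memory limit per connection
}
```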
| Property | Value |
|---|---|
| Default | 1 |
```hocon
database.prepare-threshold = 1
```

database.prepared-statement-cache-queries

| Property | Value |
|---|---|
| Default | 256 |
```hocon
database.prepared-statement-cache-queries = 256
```

database.prepared-statement-cache-size

| Property | Value |
|---|---|
| Default | 8 MiB |
```hocon
database.prepared-statement-cache-size = 8 MiB
```

database.query-http-acquisition-timeout

Query Connection Limiter
Timeout for HTTP/GraphQL queries when the system is busy. When all connections are in use, HTTP and GraphQL queries wait up to this duration before returning 503 Service Unavailable with a Retry-After header. LDAP queries block up to their query time limit instead.
Default: 5 seconds
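A sketch of raising the timeout for clients that prefer waiting over a fast 503 (the 10s value is illustrative):

```hocon
# HTTP/GraphQL queries wait up to 10s for a connection
# before receiving 503 Service Unavailable with Retry-After
database.query-http-acquisition-timeout = 10s
```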
Priority: SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT > config
| Property | Value |
|---|---|
| Default | 5s |
| Override | SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT (optional) |
```hocon
database.query-http-acquisition-timeout = ${?SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT}
```

database.retry

Retry policy for acquiring database connections. Unset fields inherit from the root retry block.
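The documented defaults, written out as one block (a sketch; any field left unset inherits from the root retry block):

```hocon
database.retry {
  initial-delay = 100 milliseconds # first delay between attempts
  jitter = 25 milliseconds         # random jitter to avoid thundering herds
  max-delay = 5 seconds            # cap on the delay between retries
  max-attempts = 0                 # 0 = unlimited attempts
  # max-duration left unset: retry indefinitely
}
```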
database.retry.initial-delay
Initial delay between retries while waiting for database connections.
Priority: SCRIBE_DATABASE_RETRY_INITIAL_DELAY > config
| Property | Value |
|---|---|
| Default | 100 milliseconds |
| Override | SCRIBE_DATABASE_RETRY_INITIAL_DELAY (optional) |
```hocon
database.retry.initial-delay = ${?SCRIBE_DATABASE_RETRY_INITIAL_DELAY}
```

database.retry.jitter

Randomized jitter added to each delay to avoid thundering herds.
Priority: SCRIBE_DATABASE_RETRY_JITTER > config
| Property | Value |
|---|---|
| Default | 25 milliseconds |
| Override | SCRIBE_DATABASE_RETRY_JITTER (optional) |
```hocon
database.retry.jitter = ${?SCRIBE_DATABASE_RETRY_JITTER}
```

database.retry.max-attempts

Set to >0 to cap retries by attempt count instead of duration (0 = unlimited).
Priority: SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS > config
| Property | Value |
|---|---|
| Default | 0 |
| Override | SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS (optional) |
```hocon
database.retry.max-attempts = ${?SCRIBE_DATABASE_RETRY_MAX_ATTEMPTS}
```

database.retry.max-delay

Maximum delay between retries.
Priority: SCRIBE_DATABASE_RETRY_MAX_DELAY > config
| Property | Value |
|---|---|
| Default | 5 seconds |
| Override | SCRIBE_DATABASE_RETRY_MAX_DELAY (optional) |
```hocon
database.retry.max-delay = ${?SCRIBE_DATABASE_RETRY_MAX_DELAY}
```

database.retry.max-duration

Maximum time spent retrying before surfacing a timeout (leave unset to retry indefinitely).
Priority: SCRIBE_DATABASE_RETRY_MAX_DURATION > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_DATABASE_RETRY_MAX_DURATION (optional) |
```hocon
database.retry.max-duration = ${?SCRIBE_DATABASE_RETRY_MAX_DURATION}
```

database.ssl

Defaults to the root ssl configuration.
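A sketch of a full ssl block for mutual TLS with server verification. The paths follow the pattern of the ca/cert defaults shown below; the key filename in particular is an assumption, not a documented default:

```hocon
database.ssl {
  enabled = true
  ca = "relative/from/config/file/ca.pem"      # root certificate for verifying the server
  cert = "relative/from/config/file/cert.pem"  # client SSL certificate
  key = "relative/from/config/file/key.pem"    # client PKCS#8 key (illustrative path)
  # password = "..."                           # only if the key is encrypted
}
```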
database.ssl.ca
The location of the file containing the root certificate for authenticating the server, used when validating the server (mode = "verify-ca" or "verify-full"). Defaults to the file "root.crt" in "$HOME/.postgresql" (*nix) or "%APPDATA%\postgresql" (Windows).
Priority: SCRIBE_DATABASE_SSL_CA > config > ssl.ca
| Property | Value |
|---|---|
| Default | "relative/from/config/file/ca.pem" |
| Override | SCRIBE_DATABASE_SSL_CA (optional) |
```hocon
database.ssl.ca = ${?SCRIBE_DATABASE_SSL_CA}
```

database.ssl.cert

The location of the file containing the client's SSL certificate. Defaults to the file "postgresql.crt" in "$HOME/.postgresql" (*nix) or "%APPDATA%\postgresql" (Windows).
Priority: SCRIBE_DATABASE_SSL_CERT > config > ssl.cert
| Property | Value |
|---|---|
| Default | "relative/from/config/file/cert.pem" |
| Override | SCRIBE_DATABASE_SSL_CERT (optional) |
```hocon
database.ssl.cert = ${?SCRIBE_DATABASE_SSL_CERT}
```

database.ssl.enabled

Must be set to true to enable SSL for the database.
Priority: SCRIBE_DATABASE_SSL_ENABLED > config
| Property | Value |
|---|---|
| Default | false |
| Override | SCRIBE_DATABASE_SSL_ENABLED (optional) |
```hocon
database.ssl.enabled = ${?SCRIBE_DATABASE_SSL_ENABLED}
```

database.ssl.key

The location of the file containing the client's PKCS#8 SSL key. Defaults to the file "postgresql.pk8" in "$HOME/.postgresql" (*nix) or "%APPDATA%\postgresql" (Windows).
Priority: SCRIBE_DATABASE_SSL_KEY > config > ssl.key
| Property | Value |
|---|---|
| Default | "relative/from/config/file/cert.pem" |
| Override | SCRIBE_DATABASE_SSL_KEY (optional) |
```hocon
database.ssl.key = ${?SCRIBE_DATABASE_SSL_KEY}
```

database.ssl.password

The password for the client's SSL key.
Priority: SCRIBE_DATABASE_SSL_PASSWORD > config > ssl.password
| Property | Value |
|---|---|
| Default | "..." |
| Override | SCRIBE_DATABASE_SSL_PASSWORD (optional) |
```hocon
database.ssl.password = ${?SCRIBE_DATABASE_SSL_PASSWORD}
```

database.system-pool-size

Size of the system connection pool. This pool handles commit coordination, migrations, maintenance, and entry preparation.
Default: max(transcribeCount + 4, concurrency / 4)
Minimum: 2
Maximum: max-pool-size / 2

The default scales with both the number of configured transcribes and the concurrency level to handle workloads where multiple transcribes perform system operations simultaneously.
Example: With 4 transcribes and concurrency = 16:
Default = max(4 + 4, 16/4) = max(8, 4) = 8 connections

Priority: SCRIBE_DATABASE_SYSTEM_POOL_SIZE > config
| Property | Value |
|---|---|
| Default | 6 |
| Override | SCRIBE_DATABASE_SYSTEM_POOL_SIZE (optional) |
```hocon
database.system-pool-size = ${?SCRIBE_DATABASE_SYSTEM_POOL_SIZE}
```

database.url

Multiple servers can be specified, separated by a comma, e.g. "postgres://server1,server2". Options can be set in the URL or as separate properties. See:
- https://www.prisma.io/dataguide/postgresql/short-guides/connection-uris
- https://jdbc.postgresql.org/documentation/use/#connecting-to-the-database
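A sketch of a multi-host URL with options inline (the host names, database name, and option values are illustrative):

```hocon
# Two candidate servers; options appended as query parameters
database.url = "postgres://server1,server2/identity_scribe?sslmode=verify-full&connectTimeout=10"
```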
Priority: SCRIBE_DATABASE_URL > config
```hocon
url = "postgres:"${user.name}
```
| Property | Value |
|---|---|
| Override | SCRIBE_DATABASE_URL (optional) |
```hocon
database.url = ${?SCRIBE_DATABASE_URL}
```

database.user

Priority: SCRIBE_DATABASE_USER > config
```hocon
user = ${user.name}
```
| Property | Value |
|---|---|
| Override | SCRIBE_DATABASE_USER (optional) |
```hocon
database.user = ${?SCRIBE_DATABASE_USER}
```