Database Tuning
You have a PostgreSQL server for IdentityScribe. These settings size memory, storage, and WAL for IdentityScribe’s workload.
For caller-side index selection and query-shape guidance, see Query performance.
All formulas assume a dedicated database server. If PostgreSQL shares a host with IdentityScribe, reduce memory fractions accordingly.
Workload profile
Section titled “Workload profile”IdentityScribe’s database workload has distinct characteristics that inform tuning:
- Three isolated connection pools — a default total of ~43 connections serving different workloads: write-heavy transcription tasks, schema maintenance, and latency-sensitive query serving (LDAP, REST, GraphQL).
- Partition-pruned leaf tables —
entries_datais sub-partitioned by entry type and attribute, producing dozens of leaf partitions. At 10M entries, a typical deployment has 27+ indexes across 16 partitions. - Multiple index types per attribute — equality and range B-tree indexes, sort-order B-tree indexes, and GIN trigram indexes for substring search. GIN indexes dominate disk usage (the largest can exceed 1 GB per attribute).
- Read-dominant steady state — equality lookups, range scans with cursor pagination, substring search, and sorted result sets. All served concurrently from LDAP, REST, and GraphQL channels.
- Bursty writes — event-sourced single-row inserts at steady state; multi-row bulk writes during reconciliation. When IdentityScribe prepares a new entry type, GIN trigram indexes are built at runtime.
Recommended settings
Section titled “Recommended settings”Settings are parameterized by server RAM and CPU cores.
Memory
Section titled “Memory”| Setting | Formula | 16 GB | 32 GB | 64 GB | Rationale |
|---|---|---|---|---|---|
shared_buffers | RAM × 0.25 | 4 GB | 8 GB | 16 GB | Must hold hot partition data and GIN indexes for query serving. Undersizing causes buffer churn — every query evicts pages needed by the next |
effective_cache_size | RAM × 0.75 | 12 GB | 24 GB | 48 GB | Reflects OS page cache availability so PostgreSQL can favor index-backed reads when hot data fits in cache |
work_mem | RAM / (connections × 4) | 16 MB | 32 MB | 64 MB | Per-sort-node per-connection. Too high risks OOM under load; too low forces disk sorts on sorted pagination queries |
maintenance_work_mem | min(RAM × 0.05, 2 GB) | 800 MB | 1.6 GB | 2 GB | GIN trigram index builds are memory-intensive. Directly affects how fast new entry types become available |
Session-level work_mem override
Section titled “Session-level work_mem override”The table above recommends server-level work_mem for postgresql.conf. IdentityScribe also sets work_mem = 16MB per query connection via database.connection-hints.session-flags.work-mem. This connection-level setting overrides the server default for IdentityScribe’s connections.
16MB works for most deployments. If query performance degrades over time or under load, insufficient work_mem can cause PostgreSQL to spill sort and hash operations to disk — which is orders of magnitude slower than in-memory processing.
Observe detects spills automatically. When it finds disk spilling, it recommends a work_mem value and provides a ready-to-use config snippet. You can also calculate manually:
Tuning formula
Section titled “Tuning formula”A safe starting point for work_mem given your server’s total RAM and connection count:
work_mem = max(16MB, min(512MB, total_ram_mb / max_connections / 4))For a server with 32GB RAM and 100 connections:
work_mem = max(16MB, min(512MB, 32768 / 100 / 4)) ≈ 82MB → round to 64MB or 96MBStart at the next standard size above the formula result (16, 32, 64, 96, 128, 256, 512 MB). Monitor temp file counts after each increase — if they stop growing, you have enough.
How to adjust
Section titled “How to adjust”Override per-channel or globally in IdentityScribe config:
database.connection-hints.session-flags { work-mem = "64MB"}Or per-channel (e.g., only for REST API queries):
channels.rest.connection-hints.session-flags { work-mem = "128MB"}The value must match <number><unit> where unit is B, kB, MB, GB, or TB (e.g., "256MB", "1GB").
Raising work_mem increases memory pressure. A single query can use work_mem multiple times — once per sort or hash operation, and once per parallel worker. On a server running many concurrent queries, doubling work_mem across the board can trigger out-of-memory errors. Raise gradually and verify in Observe that temporary-file pressure falls before moving higher.
Storage
Section titled “Storage”| Setting | Value | Rationale |
|---|---|---|
random_page_cost | 1.1 | SSD random reads are nearly as fast as sequential. Helps PostgreSQL favor index-backed reads, which is critical for partition-pruned queries |
effective_io_concurrency | 200 | SSD can handle many concurrent read requests. Benefits bitmap heap scans on GIN trigram results |
For HDD storage, use random_page_cost = 2.0 and effective_io_concurrency = 4.
Parallelism
Section titled “Parallelism”| Setting | Formula | 4 cores | 8 cores | 16 cores |
|---|---|---|---|---|
max_parallel_workers_per_gather | cores / 4 | 1 | 2 | 4 |
max_parallel_maintenance_workers | cores / 2 | 2 | 4 | 4 |
max_parallel_workers | cores / 2 | 2 | 4 | 8 |
IdentityScribe uses parallel query execution for equality lookups at scale — each equality filter is translated to a direct join predicate that PostgreSQL can split across worker processes via Gather Merge.
Maintenance parallelism is higher to speed up index creation when new entry types are prepared.
Partition-wise operations
Section titled “Partition-wise operations”| Setting | Value | Rationale |
|---|---|---|
enable_partitionwise_join | on | IdentityScribe’s schema is partitioned by entry type and attribute. Without this, PostgreSQL joins full parent tables and prunes afterward. With it, joins target individual partitions directly |
enable_partitionwise_aggregate | on | Enables per-partition GROUP BY aggregation before merging results. Benefits sort and cursor queries |
These settings are off by default in PostgreSQL because they add query-preparation overhead for non-partitioned schemas. IdentityScribe enables both per-session automatically on every query connection, so they work out of the box without server configuration.
Setting them server-level in postgresql.conf is still recommended — it avoids the per-connection SET overhead and ensures all connections (including ad-hoc psql sessions and monitoring queries) benefit.
WAL and checkpoints
Section titled “WAL and checkpoints”| Setting | Value | Rationale |
|---|---|---|
wal_level | replica | Required — event sourcing requires WAL for crash recovery and potential replication |
synchronous_commit | on | Required — event sourcing demands durable writes. Losing committed events breaks sync state |
max_wal_size | RAM × 0.125 (2–8 GB) | Larger WAL before forced checkpoint. Sync writes are bursty during reconciliation |
checkpoint_timeout | 10–15 min | Reduces checkpoint frequency. Default 5 min causes excessive I/O during sustained writes |
checkpoint_completion_target | 0.9 | Spread checkpoint writes over 90% of the interval (default; keep as-is) |
Connections
Section titled “Connections”| Setting | Formula | Rationale |
|---|---|---|
max_connections | pool total + 15 | Sum of all connection pools, plus headroom for admin, monitoring, and migrations. Default pool total is ~43, so 60 is a safe starting point |
Autovacuum
Section titled “Autovacuum”| Setting | Value | Rationale |
|---|---|---|
autovacuum | on | Required — partitioned tables with frequent updates need regular vacuum to prevent bloat and maintain visibility maps |
autovacuum_vacuum_scale_factor | 0.05 | More aggressive than default (0.2) — partitions are smaller, so the default leaves too many dead tuples proportionally |
autovacuum_analyze_scale_factor | 0.02 | Keeps statistics fresh. Stale partition stats reduce pruning efficiency and can slow filter-heavy searches |
Settings that must not be used in production
Section titled “Settings that must not be used in production”Example configuration
Section titled “Example configuration”For a 32 GB RAM / 8-core SSD server with default IdentityScribe pool sizes:
# Memoryshared_buffers = 8GBeffective_cache_size = 24GBwork_mem = 32MBmaintenance_work_mem = 1536MB
# Storage (SSD)random_page_cost = 1.1effective_io_concurrency = 200
# Parallelismmax_parallel_workers_per_gather = 2max_parallel_maintenance_workers = 4max_parallel_workers = 4
# Partition-wise operations (required for IdentityScribe)enable_partitionwise_join = onenable_partitionwise_aggregate = on
# WAL and Checkpointsmax_wal_size = 4GBcheckpoint_timeout = 10min
# Connectionsmax_connections = 60
# Autovacuum (tuned for partitioned tables)autovacuum_vacuum_scale_factor = 0.05autovacuum_analyze_scale_factor = 0.02After restarting PostgreSQL, confirm buffer allocation with SHOW shared_buffers and monitor query performance on the Health and Monitoring dashboard.
IdentityScribe connection pools
Section titled “IdentityScribe connection pools”IdentityScribe manages three separate HikariCP connection pools to isolate workloads. These are configured in the IdentityScribe config (not postgresql.conf).
| Pool | Used by | Default size | Tuning guidance |
|---|---|---|---|
| Batch | Transcription tasks (main write workload) | concurrency + 5, clamped to concurrency × 1.5 | Increase if hikari.connections.pending is consistently high on the batch pool; decrease on memory-constrained hosts |
| System | Migrations, maintenance, health checks, DDL | max(transcribeCount + 4, concurrency / 4), clamped [2, maxPoolSize / 2] | Increase if maintenance windows overlap with heavy write load |
| Channel | REST, GraphQL, LDAP query serving | concurrency, minimum 2 | Increase if channel latency is dominated by connection acquisition waits (hikari.connections.pending on the channel pool) |
The total connection count is the sum of all three pools. Ensure max_connections in postgresql.conf can accommodate the total (see the Connections table above).
A semaphore (default: channel pool size) caps concurrent query connections across all channels. HTTP and GraphQL queries that exceed this wait up to query-http-acquisition-timeout (default: 5s) and return 503 with a Retry-After header. LDAP queries block up to their query time limit instead.
| Config | Env var | Default |
|---|---|---|
database.max-pool-size | SCRIBE_DATABASE_MAX_POOL_SIZE | concurrency + 5 |
database.system-pool-size | SCRIBE_DATABASE_SYSTEM_POOL_SIZE | max(transcribeCount + 4, concurrency / 4) |
database.channel-pool-size | SCRIBE_DATABASE_CHANNEL_POOL_SIZE | concurrency |
database.query-http-acquisition-timeout | SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT | 5s |
Adaptive filtered-search tuning
Section titled “Adaptive filtered-search tuning”For filtered, sorted queries, IdentityScribe adapts between a narrow-match strategy and a broad-match strategy based on how many rows PostgreSQL expects to qualify.
- Narrow-match searches finish quickly when only a small share of rows match.
- Broad-match searches stay predictable when a large share of rows match.
These controls influence when IdentityScribe changes between those search modes. Configurable via:
| Config | Env var | Default | When to change |
|---|---|---|---|
database.filter-selectivity.gate-conjunction-crossover | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_CONJUNCTION_CROSSOVER | 2.0 | Increase to use the optimized path for more multi-filter searches; decrease to stay conservative when combined filters still match many entries |
database.filter-selectivity.gate-like-base-pattern-length | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_BASE_PATTERN_LENGTH | 4 | Increase if short substring patterns are still too broad in your directory; decrease only if support confirms short patterns are selective enough |
database.filter-selectivity.gate-like-cardinality-reference | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_CARDINALITY_REFERENCE | 100000 | Adjust only with support when entry-type sizes differ heavily from the default model |
database.filter-selectivity.gate-like-adaptive-ceiling | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_ADAPTIVE_CEILING | 20 | Raise to require longer patterns before acceleration on very large directories; lower to cap the threshold earlier |
database.filter-selectivity.gate-like-sort-attr-enabled | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_SORT_ATTR_ENABLED | true | Set false as a safety valve if substring searches on sorted attributes regress in your environment |
database.filter-selectivity.stale-age-horizon | SCRIBE_DATABASE_FILTER_SELECTIVITY_STALE_AGE_HORIZON | 24h | Lower if attribute distributions change rapidly (e.g. after bulk imports) |
database.filter-selectivity.churn-budget-ratio | SCRIBE_DATABASE_FILTER_SELECTIVITY_CHURN_BUDGET_RATIO | 0.01 (1%) | Increase if background refresh overhead is too high; decrease for stricter freshness |
Range search tuning
Section titled “Range search tuning”Range conditions on sortable attributes (e.g., sn >= 'M') can use a pre-computed min/max index instead of scanning the full attribute data. The index stores the minimum and maximum value per entry range, allowing the system to skip entries whose range doesn’t overlap with the filter condition.
IdentityScribe uses the estimated result size to decide when this index is worth using:
- Small result sets use the sort index to avoid reading attribute data for rows that cannot match.
- Large result sets stay on the standard attribute-data path.
These estimates use PostgreSQL statistics that IdentityScribe refreshes during maintenance.
| Config | Env var | Default | When to change |
|---|---|---|---|
database.filter-selectivity.gate-sortable-range-fraction | SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_SORTABLE_RANGE_FRACTION | 0.05 | Increase to use the sort index for more range searches (useful when attribute values are large and the standard path is expensive). Decrease to prefer the standard path when ranges match a large fraction of entries. Set to 0.0 to disable sort-index acceleration for range searches entirely. |
Prefix search optimisation
Section titled “Prefix search optimisation”For prefix searches (startsWith-style patterns), IdentityScribe first measures how many entries match a new pattern, then chooses the faster execution mode for that pattern. The result is cached per (entry type, attribute, prefix) combination and reused on later requests, so only the first search for a previously unseen prefix performs the measurement.
This applies only to prefix-shaped patterns. Contains and suffix patterns use the standard behaviour.
| Config | Env var | Default | When to change |
|---|---|---|---|
database.filter-selectivity.probe.enabled | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_ENABLED | true | Disable to keep the conservative execution mode for all prefix searches; requires restart |
database.filter-selectivity.probe.timeout | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_TIMEOUT | 1s | Raise only if you see frequent measurement timeouts on large or heavily loaded databases |
database.filter-selectivity.probe.cache-max-size | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_CACHE_MAX_SIZE | 1000 | Increase if you serve a large number of distinct search prefixes and observe cache churn |
database.filter-selectivity.probe.cache-ttl | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_CACHE_TTL | 1h | Decrease if attribute distributions change frequently; increase to reduce measurement frequency |
database.filter-selectivity.probe.stale-mod-ratio | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_STALE_MOD_RATIO | 0.10 | Decrease to refresh more aggressively after bulk imports |
database.filter-selectivity.probe.probe-max-concurrent | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_MAX_CONCURRENT | 4 | Increase on high-concurrency systems with spare system pool capacity; decrease to limit connection pressure |
database.filter-selectivity.probe.probe-failure-cooldown | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_FAILURE_COOLDOWN | 30s | Raise to reduce retries on flaky databases; lower for faster recovery after an issue resolves |
database.filter-selectivity.probe.partition-stats-cache-ttl | SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_PARTITION_STATS_CACHE_TTL | 1m | Decrease to detect data changes faster; increase to reduce background statistics queries |
Adaptive substring search
Section titled “Adaptive substring search”Substring filters (cn=*hugo*), prefix scans, and multi-term substring
searches can perform very differently across attributes and directory sizes. A
baseline that works well for cn may be too aggressive or too conservative for
mail, and it can become stale as the directory grows.
IdentityScribe now learns per-attribute behaviour from real traffic. It starts from conservative baseline values, observes how many entries match incoming searches, and adjusts automatically. No manual tuning is required for standard deployments.
What you can adjust
Section titled “What you can adjust”database.gate-learning { enabled = true
persistence.flush-interval = 5m
defaults { single-substring = 6.0 substring-or = 15.0 same-attribute-mixed-or = 3.0 }}| Config | Default | When to change |
|---|---|---|
database.gate-learning.enabled | true | Set false to disable learning and revert to fixed baseline values (see kill switch below) |
database.gate-learning.persistence.flush-interval | 5m | Shorten (e.g. 30s) if the JVM restarts frequently and you want learned search behaviour preserved across restarts with minimal warm-up time |
database.gate-learning.defaults.single-substring | 6.0 | Cold-start baseline for single substring filters. Leave at the default unless support recommends a change |
database.gate-learning.defaults.substring-or | 15.0 | Cold-start baseline for multi-term substring searches. Leave at the default unless support recommends a change |
database.gate-learning.defaults.same-attribute-mixed-or | 3.0 | Cold-start baseline for mixed searches on a single attribute. Leave at the default unless support recommends a change |
The defaults.* values are the startup baselines used until enough traffic has
accumulated for the learned per-attribute state to stabilise. In most
deployments this happens within minutes of first production queries, and the
starting behaviour is already suitable for the majority of workloads.
What you can observe
Section titled “What you can observe”The primary metrics track which search mode was chosen and whether calibration is healthy:
-
scribe.gate.decision.total— counts every adaptive search decision. Labels includescribe.gate.routing_path(the selected search mode),scribe.gate.routing_source(whether the choice came from a startup baseline, learned state, or a direct measurement), andscribe.gate.confidence(calibration confidence:COLD,LOW,MEDIUM, orHIGH). A sustainedrouting_source="DEFAULT"on a busy attribute means the system has not yet seen enough traffic for that attribute to move beyond cold start. This is normal for newly added attributes or rarely searched ones. -
scribe.gate.misroute_regret.total— counts cases where later feedback showed a different search mode would likely have been faster. A non-zero value is expected during warm-up. If this counter climbs steadily for the same kind of search after 24 h of production traffic, the learning has not converged. Contact support. -
scribe.gate.probe.failures.total— counts background calibration failures. A short burst can happen while a new attribute is still being indexed; sustained failures mean the system cannot collect fresh calibration data for that kind of search.
Kill switch
Section titled “Kill switch”If you observe a search-performance regression on a specific attribute and need immediate relief, disable learning:
database.gate-learning.enabled = falseWith learning disabled, all adaptive decisions use the fixed defaults.*
values shown above, which are the same startup baselines used on cold start.
Behaviour becomes deterministic and matches the pre-learning mode. Re-enable
once the regression is understood.
Statistics target
Section titled “Statistics target”IdentityScribe dynamically computes the PostgreSQL statistics target before each maintenance refresh, adjusting it based on directory size. This keeps filter and sort performance more consistent at scale without the overhead of a blanket high target.
The computed target is logged at the start of each maintenance window.
| Config | Env var | Default |
|---|---|---|
database.maintenance.statistics-target-ratio | SCRIBE_DATABASE_MAINTENANCE_STATISTICS_TARGET_RATIO | 0.00005 |
Set to 0 to disable dynamic tuning and use PostgreSQL’s default.
Sort index backfill
Section titled “Sort index backfill”Sort indexes store a compact min/max range per entry for each sortable attribute. These indexes are built in the background at startup; the service remains fully operational during the build window.
Backfill concurrency auto-scales from available system pool resources, reserving enough connections for maintenance and health checks.
| Config | Env var | Default | When to change |
|---|---|---|---|
database.sort-index-backfill.max-concurrent | SCRIBE_DATABASE_SORT_INDEX_BACKFILL_MAX_CONCURRENT | Auto-scaled | Increase to shorten backfill window on high-spec systems; decrease to reduce read pressure during initial sync |
Quick reference
Section titled “Quick reference”All IdentityScribe database configuration environment variables:
| Env var | Default |
|---|---|
SCRIBE_DATABASE_MAX_POOL_SIZE | concurrency + 5 |
SCRIBE_DATABASE_SYSTEM_POOL_SIZE | max(transcribeCount + 4, concurrency / 4) |
SCRIBE_DATABASE_CHANNEL_POOL_SIZE | concurrency |
SCRIBE_DATABASE_QUERY_HTTP_ACQUISITION_TIMEOUT | 5s |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_CONJUNCTION_CROSSOVER | 2.0 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_SORTABLE_RANGE_FRACTION | 0.05 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_BASE_PATTERN_LENGTH | 4 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_CARDINALITY_REFERENCE | 100000 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_ADAPTIVE_CEILING | 20 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_GATE_LIKE_SORT_ATTR_ENABLED | true |
SCRIBE_DATABASE_FILTER_SELECTIVITY_STALE_AGE_HORIZON | 24h |
SCRIBE_DATABASE_FILTER_SELECTIVITY_CHURN_BUDGET_RATIO | 0.01 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_ENABLED | true |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_TIMEOUT | 1s |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_CACHE_MAX_SIZE | 1000 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_CACHE_TTL | 1h |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_STALE_MOD_RATIO | 0.10 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_MAX_CONCURRENT | 4 |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_FAILURE_COOLDOWN | 30s |
SCRIBE_DATABASE_FILTER_SELECTIVITY_PROBE_PARTITION_STATS_CACHE_TTL | 1m |
SCRIBE_DATABASE_MAINTENANCE_STATISTICS_TARGET_RATIO | 0.00005 |
SCRIBE_DATABASE_SORT_INDEX_BACKFILL_MAX_CONCURRENT | Auto-scaled |
SCRIBE_DATABASE_GATE_LEARNING_ENABLED | true |
SCRIBE_DATABASE_GATE_LEARNING_PERSISTENCE_FLUSH_INTERVAL | 5m |
SCRIBE_DATABASE_GATE_LEARNING_DEFAULTS_SINGLE_SUBSTRING | 6.0 |
SCRIBE_DATABASE_GATE_LEARNING_DEFAULTS_SUBSTRING_OR | 15.0 |
SCRIBE_DATABASE_GATE_LEARNING_DEFAULTS_SAME_ATTRIBUTE_MIXED_OR | 3.0 |