Monitoring

Monitoring endpoints and telemetry configuration. Includes Prometheus metrics, health probes, and OpenTelemetry tracing.

Default authentication settings for monitoring endpoints. Categories (prometheus, health, observe, mcp) inherit from here unless they override. This inherits from the root auth {} configuration.

Example: Secure all monitoring endpoints by default

monitoring.auth {
  enabled = true
  methods = [bearer]
}

Property | Value
Override | SCRIBE_MONITORING_AUTH_ENABLED (optional)
monitoring.auth.enabled = ${?SCRIBE_MONITORING_AUTH_ENABLED}

Property | Value
Override | SCRIBE_MONITORING_AUTH_METHODS (optional)
monitoring.auth.methods = ${?SCRIBE_MONITORING_AUTH_METHODS}

Enable the monitoring server.

Property | Value
Default | true
monitoring.enabled = true

Exemplars (metrics → traces) for histogram metrics. Exemplars attach trace context (traceId, spanId) to histogram buckets, enabling “click to trace” in Grafana when using Prometheus + Tempo.

Values:

  • off - No exemplars recorded (default, minimal overhead)
  • traceBased - Record exemplars only when an active sampled span exists
  • alwaysOn - Record all measurements as exemplar candidates

IMPORTANT:

  • traceBased requires tracing to be enabled (monitoring.traces.enabled=true)
  • Exemplars require a metrics exporter to be enabled: monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true

External requirements (operator responsibility):

  • Prometheus: --enable-feature=exemplar-storage
  • Grafana: Configure Tempo datasource with exemplar linking

Priority: SCRIBE_METRICS_EXEMPLARS > OTEL_METRICS_EXEMPLAR_FILTER > config

Property | Value
Default | "off"
Override | SCRIBE_METRICS_EXEMPLARS (optional) > OTEL_METRICS_EXEMPLAR_FILTER (standard)
monitoring.exemplars = "off"
monitoring.exemplars = ${?OTEL_METRICS_EXEMPLAR_FILTER}
monitoring.exemplars = ${?SCRIBE_METRICS_EXEMPLARS}
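As a hedged illustration of the requirements above, the fragment below combines the three settings named in this section so that trace-based exemplars have both an active tracer and a metrics exporter. It is a sketch rather than a recommended production profile; the Prometheus and Grafana setup listed under external requirements still has to be done separately.

```hocon
monitoring {
  # Record exemplars only when an active sampled span exists
  exemplars = "traceBased"
  # traceBased requires tracing to be enabled
  traces.enabled = true
  # Exemplars need at least one metrics exporter
  prometheus.enabled = true
}
```

If SCRIBE_METRICS_EXEMPLARS or OTEL_METRICS_EXEMPLAR_FILTER is set in the environment, it takes precedence over the value written in the config file.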

Health probe endpoints (/livez, /readyz, /startedz, /healthz). Kubernetes-style health probes for liveness, readiness, and startup checks.

Authentication for health probe endpoints.

Default: allow anonymous access (K8s probes typically don’t support auth).

Most Kubernetes configurations use unauthenticated probes. Override to require auth or restrict by IP:

monitoring.health.auth.rules = [
  { action = allow, where = "request.client.ip startswith 10.0." }
]

Property | Value
Override | SCRIBE_HEALTH_AUTH_METHODS (optional)
monitoring.health.auth.methods = ${?SCRIBE_HEALTH_AUTH_METHODS}

Allow anonymous access for K8s probes.

Property | Value
Default | [{ id = "health-public", action = allow }]
monitoring.health.auth.rules = [{ id = "health-public", action = allow }]

Enable health probe endpoints.

Priority: SCRIBE_HEALTH_ENABLED > config

Property | Value
Default | true
Override | SCRIBE_HEALTH_ENABLED (optional)
monitoring.health.enabled = true
monitoring.health.enabled = ${?SCRIBE_HEALTH_ENABLED}

Socket for health probe endpoints. Falls back to monitoring.socket if unset.

Priority: SCRIBE_HEALTH_SOCKET > config

Property | Value
Override | SCRIBE_HEALTH_SOCKET (optional)
monitoring.health.socket = ${?SCRIBE_HEALTH_SOCKET}
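A minimal sketch of a more restrictive health configuration, built only from the keys documented above. It assumes a socket named "admin" is defined elsewhere (the name is hypothetical) and that requests matching no rule are rejected, which this page does not state explicitly.

```hocon
monitoring.health {
  enabled = true
  # Hypothetical socket name; monitoring.socket is used if this is unset
  socket = "admin"
  auth.rules = [
    # Allow probes only from client addresses beginning with 10.0.
    { id = "health-internal", action = allow, where = "request.client.ip startswith 10.0." }
  ]
}
```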

Configuration for the operational hint engine. The hint engine detects performance issues and optimization opportunities during LDAP search execution, surfacing actionable recommendations via metrics and structured logs.

Detector modules:

  • Equality/Range Coverage: Detects missing valueMatch indexes
  • Sort/VLV Readiness: Detects missing sortable indexes and non-deterministic ordering
  • Control Fallback: Detects when LDAP controls force delegation or rewrites
  • Type Scope Guidance: Recommends explicit entryType constraints for better performance
  • Startup Validation: Validates indices configuration at application startup

Query signature cache configuration. Controls TTL and size for the query signature cache used by EXPLAIN sampling. This cache stores execution counts, last EXPLAIN timestamps, sequential-scan flags, and SQL text needed for future analysis.

Maximum number of cached query signatures. When cache is full, least-recently-used entries are evicted.

Priority: SCRIBE_HINTS_CACHE_SIZE > config

Default: 500

Property | Value
Default | 500
Override | SCRIBE_HINTS_CACHE_SIZE (optional)
monitoring.hints.cache.size = 500
monitoring.hints.cache.size = ${?SCRIBE_HINTS_CACHE_SIZE}

Time-to-live for signature cache entries. Signatures older than this TTL will be re-EXPLAINed on next execution. Cache uses expire-after-access, so hot signatures stay longer.

Priority: SCRIBE_HINTS_CACHE_TTL > config

Default: 1 hour

Property | Value
Default | 1 hour
Override | SCRIBE_HINTS_CACHE_TTL (optional)
monitoring.hints.cache.ttl = 1 hour
monitoring.hints.cache.ttl = ${?SCRIBE_HINTS_CACHE_TTL}
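For illustration, a deployment with many distinct query shapes might enlarge the cache and shorten the TTL. The values below are arbitrary examples, not tuned recommendations.

```hocon
monitoring.hints.cache {
  # Keep up to 2000 query signatures; least-recently-used entries are evicted beyond that
  size = 2000
  # Signatures not accessed for 30 minutes expire and are re-EXPLAINed on next execution
  ttl = 30 minutes
}
```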

Per-detector module toggles.

Each detector can be individually enabled or disabled. Defaults to enabled when monitoring.hints.enabled is true.

monitoring.hints.detectors.config-validation

Enable/disable startup configuration validation. Validates that all indexed attributes exist in observed attributes and checks virtual attributes for entryType in filters. Emits CONFIG hints for misconfigurations at startup.

Property | Value
Default | true
monitoring.hints.detectors.config-validation = true

monitoring.hints.detectors.control-fallback

Enable/disable control fallback detection. Emits hints when LDAP controls (e.g., VLV assertion values, unsupported sort attributes) force delegation or in-memory rewrites.

Property | Value
Default | true
monitoring.hints.detectors.control-fallback = true

Enable/disable equality and range index coverage detection. Emits hints when equality (=), greaterOrEqual (>=), or lessOrEqual (<=) filters are used on attributes missing value-match (truncation) indexes.

Property | Value
Default | true
monitoring.hints.detectors.equality-index = true

Enable/disable sort and VLV readiness detection. Emits hints when sortable indexes are missing or sort order is non-deterministic (multi-valued attributes without tie-breaker).

Property | Value
Default | true
monitoring.hints.detectors.sort-plan = true

monitoring.hints.detectors.type-scope-guidance

Enable/disable type scope guidance. Emits informational hints recommending explicit (entryType=X) constraints when types are known from DN context but filters lack entryType.

Property | Value
Default | true
monitoring.hints.detectors.type-scope-guidance = true
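The detector toggles can be set together in one block. The sketch below keeps the hint engine on but silences the purely informational type scope guidance; which detectors to disable is a deployment choice, and this selection is only an example.

```hocon
monitoring.hints {
  enabled = true
  detectors {
    config-validation = true
    control-fallback = true
    equality-index = true
    sort-plan = true
    # Informational hints only; disabled here to reduce noise
    type-scope-guidance = false
  }
}
```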

Enable or disable hint collection and emission. When disabled, no hints are collected and no metrics are incremented.

Priority: SCRIBE_HINTS_ENABLED > config

Property | Value
Default | true
Override | SCRIBE_HINTS_ENABLED (optional)
monitoring.hints.enabled = true
monitoring.hints.enabled = ${?SCRIBE_HINTS_ENABLED}

EXPLAIN-based runtime analysis and query signature tracking. Enables signature-based EXPLAIN sampling to detect sequential scans and correlate query patterns with performance issues.

Enable or disable EXPLAIN sampling and query signature tracking. When disabled, no signatures are computed, no EXPLAIN plans are analyzed, and the /observe/signatures endpoint is unavailable. Defaults to enabled when monitoring.hints.enabled is true.

Priority: SCRIBE_HINTS_EXPLAIN_ENABLED > config

Property | Value
Override | SCRIBE_HINTS_EXPLAIN_ENABLED (optional)
monitoring.hints.explain.enabled = ${?SCRIBE_HINTS_EXPLAIN_ENABLED}

EXPLAIN command options for enhanced plan analysis. Options control what information is included in EXPLAIN output:

  • ANALYZE: Execute query and show actual runtime statistics (actual rows, actual time)
  • COSTS: Include estimated costs (enabled by default in PostgreSQL)
  • VERBOSE: Include additional details (output columns, schema-qualified names)
  • BUFFERS: Include buffer usage statistics (requires ANALYZE)

Default: [] (empty, uses default PostgreSQL EXPLAIN behavior)

Example: ["ANALYZE", "BUFFERS"] for runtime stats and buffer usage

Property | Value
Default | []
monitoring.hints.explain.options = []
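Written out as configuration, the example mentioned above might look like the following. Because ANALYZE executes the query to collect actual runtime statistics, sampled EXPLAIN runs become more expensive with this setting; treat it as a diagnostic sketch rather than a default to copy.

```hocon
monitoring.hints.explain {
  enabled = true
  # ANALYZE adds actual rows and timings; BUFFERS requires ANALYZE
  options = ["ANALYZE", "BUFFERS"]
}
```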

Re-EXPLAIN every N query executions for the same signature. Used to detect plan drift and refresh analysis for frequently-executed queries.

Priority: SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY > config

Default: 1000

Property | Value
Default | 1000
Override | SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY (optional)
monitoring.hints.explain.sample-every = 1000
monitoring.hints.explain.sample-every = ${?SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY}

monitoring.hints.explain.slow-query-duration

Slow query threshold duration. Queries exceeding this duration trigger high-priority EXPLAIN sampling (non-blocking, background execution). This value also controls fast-query hint filtering: sequential scan hints without actionable index recommendations are suppressed when execution time is below the square root of slow-query-duration, with the duration taken in milliseconds (sqrt(1000) ≈ 31ms at the default of 1s). This reduces noise from fast queries on small tables that don’t benefit from indexing.

slow-query-duration | Fast hint threshold
100ms | ~10ms
500ms | ~22ms
1000ms (default) | ~31ms
5000ms | ~70ms

Priority: SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION > config

Default: 1000ms

Property | Value
Default | 1000ms
Override | SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION (optional)
monitoring.hints.explain.slow-query-duration = 1000ms
monitoring.hints.explain.slow-query-duration = ${?SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION}
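As a worked example of the relationship described above, lowering the threshold to 500ms also lowers the fast hint threshold to roughly 22ms (the square root of 500). The sample-every value shown is an arbitrary illustration.

```hocon
monitoring.hints.explain {
  # Queries slower than 500ms trigger high-priority EXPLAIN sampling;
  # sequential scan hints without index recommendations are suppressed below ~22ms
  slow-query-duration = 500ms
  # Re-EXPLAIN each signature every 500 executions (example value)
  sample-every = 500
}
```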

Hint persistence configuration.

Persists hints to PostgreSQL for audit and Insights visualization. When enabled, both hint signals and query signatures are persisted. Query signatures store the full normalized query structure (filter, sort, scope, types, controls) for debugging and correlation with hints.

monitoring.hints.persistence.batch-interval

Maximum wait time between batch flushes. Batches are flushed when full or when this interval elapses.

Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL > config

Default: 5 seconds

Property | Value
Default | 5 seconds
Override | SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL (optional)
monitoring.hints.persistence.batch-interval = 5 seconds
monitoring.hints.persistence.batch-interval = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL}

Batch size for database inserts. Hints and signatures are batched before writing to reduce database load.

Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE > config

Default: 100

Property | Value
Default | 100
Override | SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE (optional)
monitoring.hints.persistence.batch-size = 100
monitoring.hints.persistence.batch-size = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE}

monitoring.hints.persistence.clear-on-startup

Clear persisted hints on startup. When true, truncates hint_signals and hint_signatures tables at startup. Ensures hints reflect current configuration rather than stale state from previous runs. Useful when configuration changes (e.g., new indexes) make old hints obsolete.

Priority: SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP > config

Default: true

Property | Value
Default | true
Override | SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP (optional)
monitoring.hints.persistence.clear-on-startup = true
monitoring.hints.persistence.clear-on-startup = ${?SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP}

Enable or disable hint persistence. When disabled, hints are only emitted via metrics and logs.

Priority: SCRIBE_HINTS_PERSISTENCE_ENABLED > config

Default: false (opt-in)

Property | Value
Default | false
Override | SCRIBE_HINTS_PERSISTENCE_ENABLED (optional)
monitoring.hints.persistence.enabled = false
monitoring.hints.persistence.enabled = ${?SCRIBE_HINTS_PERSISTENCE_ENABLED}

Maximum size of the in-memory persistence queue. When queue is full, newest hints are dropped (FIFO eviction).

Priority: SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE > config

Default: 5000

Property | Value
Default | 5000
Override | SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE (optional)
monitoring.hints.persistence.max-queue = 5000
monitoring.hints.persistence.max-queue = ${?SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE}

Maximum number of hint rows to retain in the database (ring-buffer). When this limit is reached, oldest hints are evicted to make room for new ones.

Priority: SCRIBE_HINTS_PERSISTENCE_MAX_ROWS > config

Default: 50000

Property | Value
Default | 50000
Override | SCRIBE_HINTS_PERSISTENCE_MAX_ROWS (optional)
monitoring.hints.persistence.max-rows = 50000
monitoring.hints.persistence.max-rows = ${?SCRIBE_HINTS_PERSISTENCE_MAX_ROWS}

Rules for filtering which hints are persisted. Rules are evaluated in order - first matching rule wins. Hints that match an “exclude” rule are still emitted via metrics and logs, but not stored in the database.

Rule syntax (same as log.rules):

{ action = include|exclude, where = "filter", id = "identifier" }

The ‘id’ field is optional and defaults to “Rule #N” (1-based).

Available attributes for filtering:

  • hint-type: PARTIAL_MATCH_INDEX, EQUALITY_INDEX, SORT_PLAN, CONTROL_FALLBACK, EXPLAIN_SEQ_SCAN, TYPE_SCOPE_GUIDANCE, CONFIG
  • severity: INFO, WARNING, ERROR
  • attribute: the attribute name (e.g., "cn", "mail"), null if not applicable
  • entry-type: first entry type if multiple (e.g., "inetOrgPerson"), null if none
  • base-dn: the base DN of the query (e.g., "ou=users,dc=example,dc=com"), null if none

Examples:
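The sketch below shows what such rules could look like, composed from the rule syntax and attribute names listed above. The exact filter form (attribute=VALUE inside where) is an assumption modeled on the "scribe.result=ok" style used for log rules, and the id values are made up for illustration.

```hocon
monitoring.hints.persistence.rules = [
  # Assumed filter form: keep informational type-scope hints out of the database
  { action = exclude, where = "hint-type=TYPE_SCOPE_GUIDANCE", id = "skip-type-scope" }
  # Assumed filter form: do not persist startup configuration hints
  { action = exclude, where = "hint-type=CONFIG", id = "skip-config" }
]
```

Excluded hints are still emitted via metrics and logs; only database storage is skipped.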

Default: [] (empty - persist all hints)

Property | Value
Default | []
monitoring.hints.persistence.rules = []

Time-to-live for persisted hints and signatures. Hints and signatures older than this TTL are automatically purged.

Priority: SCRIBE_HINTS_PERSISTENCE_TTL > config

Default: 7 days

Property | Value
Default | 7 days
Override | SCRIBE_HINTS_PERSISTENCE_TTL (optional)
monitoring.hints.persistence.ttl = 7 days
monitoring.hints.persistence.ttl = ${?SCRIBE_HINTS_PERSISTENCE_TTL}
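Pulling the persistence settings together, an opt-in configuration that keeps history across restarts might look like this. All numbers are illustrative values chosen for the example, not recommendations.

```hocon
monitoring.hints.persistence {
  # Persistence is opt-in (default: false)
  enabled = true
  # Keep hints from previous runs instead of truncating at startup
  clear-on-startup = false
  # Flush when 200 items are queued or 2 seconds have elapsed, whichever comes first
  batch-size = 200
  batch-interval = 2 seconds
  # Ring-buffer limit and age-based purge
  max-rows = 100000
  ttl = 14 days
}
```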

HTTP server settings for monitoring endpoints.

This section inherits all settings from http. Override individual settings as needed.


Wide Log Operations

Accumulates context throughout request/task execution and emits a single structured log line when the operation is “interesting” (errors, warnings, slow). Wide events are emitted at INFO/WARN/ERROR level with the format:

scribe.log {"trace_id":"…","span_id":"…","duration":"PT0.123S",…}

  • Fields always included: duration (ISO-8601), result (ok|<failure_kind>)
  • Optional fields: trace_id, span_id, parent_span_id (when tracing enabled), failure (kind, code, details), warnings, events

Child Segment Tracking

Records child segments as events in wide logs with offset/duration timing. Useful for understanding operation breakdown and identifying slow sub-operations.

Display mode for segment events in PRETTY format.

Modes:

  • auto - summary in dev mode, off in prod (default)
  • full - Show all segment events with timing
  • summary - Show counts by name + top 5 slowest segments
  • off - Don’t show segment events (still shows warnings and other events)

Priority: SCRIBE_LOG_CHILDS_DISPLAY > config

Property | Value
Default | "auto"
Override | SCRIBE_LOG_CHILDS_DISPLAY (optional)
monitoring.log.childs.display = "auto"
monitoring.log.childs.display = ${?SCRIBE_LOG_CHILDS_DISPLAY}

Mode: off | minimal | full | auto

  • auto - Full in dev mode, minimal in prod (default)
  • off - Don’t track segments
  • minimal - Track name, offset, duration only
  • full - Track name, offset, duration, and segment attributes

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local/test) → full
  • Production mode → minimal

Priority: SCRIBE_LOG_CHILDS_MODE > config

Property | Value
Default | "auto"
Override | SCRIBE_LOG_CHILDS_MODE (optional)
monitoring.log.childs.mode = "auto"
monitoring.log.childs.mode = ${?SCRIBE_LOG_CHILDS_MODE}

Rules for which segments to track. First matching rule wins. Empty list = track all segments.

Rule syntax (same as log.rules):

{ action = include|exclude, name = "glob", where = "filter", id = "identifier" }

The ‘id’ field is optional and defaults to “Rule #N” (1-based).

Example: Filter sub-millisecond segments:

rules = [
  { action = exclude, where = "duration.seconds<1ms" }
]

Property | Value
Default | [{ action = exclude, where = "duration.seconds<1ms" }]

monitoring.log.childs.rules = [
  ## Default: filter sub-millisecond noise in production
  ## Exclude segments < 1ms
  { action = exclude, where = "duration.seconds<1ms" }
]
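A hedged sketch combining the three child segment settings: full tracking, a summary display, and a coarser noise filter. The 5ms cutoff and the rule id are illustrative choices.

```hocon
monitoring.log.childs {
  # Track name, offset, duration, and segment attributes
  mode = "full"
  # Show counts by name plus the slowest segments in PRETTY output
  display = "summary"
  rules = [
    # Drop segments shorter than 5ms; everything else is tracked
    { action = exclude, where = "duration.seconds<5ms", id = "drop-fast-segments" }
  ]
}
```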

Enable wide-event logging.

When disabled, no wide logs are emitted.

Property | Value
Default | true
Override | SCRIBE_LOG_ENABLED (optional)
monitoring.log.enabled = true
monitoring.log.enabled = ${?SCRIBE_LOG_ENABLED}

Filter client errors through log rules. When true (default), client errors (4xx) are subject to log rules and may be suppressed. When false, client errors bypass rules and are always logged (like server errors). Security-relevant errors (UNAUTHENTICATED, PERMISSION_DENIED) are always logged.

Priority: SCRIBE_LOG_FILTER_CLIENT_ERRORS > config

Property | Value
Default | true
Override | SCRIBE_LOG_FILTER_CLIENT_ERRORS (optional)
monitoring.log.filter-client-errors = true
monitoring.log.filter-client-errors = ${?SCRIBE_LOG_FILTER_CLIENT_ERRORS}

Log format for wide event emission.

Supported formats:

  • pretty - Human-friendly multi-line format with auto-grouped attributes, segment timeline, and color support (TTY only) (default)
  • json - Single-line JSON payload (machine-parseable)
  • auto - Pretty in dev mode + TTY, JSON otherwise

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local/test) + TTY → pretty with colors
  • Everything else → json

Property | Value
Default | "pretty"
Override | SCRIBE_LOG_FORMAT (optional)
monitoring.log.format = "pretty"
monitoring.log.format = ${?SCRIBE_LOG_FORMAT}

Leak Detection

Detects segments that remain open longer than expected, indicating potential resource leaks or forgotten end() calls. When triggered, logs a warning with the creation stack trace for debugging.

Mode: off | on | auto

  • off - Never run leak detection
  • on - Always run leak detection
  • auto - Enabled when app.mode is dev/development/local/test; disabled otherwise

Priority: SCRIBE_LOG_LEAK_DETECTION > config

Property | Value
Default | auto
Override | SCRIBE_LOG_LEAK_DETECTION (optional)
monitoring.log.leak-detection.mode = auto
monitoring.log.leak-detection.mode = ${?SCRIBE_LOG_LEAK_DETECTION}

Rules for leak detection - first match wins. Use name patterns to exclude known long-running operations. No matching rule = include (emit leak warning).

Examples:

  • Exclude long-running workers: { action = exclude, name = "Transcription.*" }
  • Exclude background jobs: { action = exclude, name = "Background.*" }

Property | Value
Default | see the rule list below

monitoring.log.leak-detection.rules = [
  ## MCP uses SSE streaming - connections legitimately stay open indefinitely
  { action = exclude, name = "HTTP * /mcp" }
  { action = exclude, name = "HTTP * /observe/mcp" }
  ## Async LDAP searches during sync can run indefinitely - not leaks
  { action = exclude, name = "Ingest.AsyncSearch.*" }
  ## Reconciliation can take minutes for large directories - not a leak
  { action = exclude, name = "Ingest.Reconciliation" }
  ## Don't warn for segments that haven't exceeded the global threshold
  { action = exclude, where = "leak.duration.seconds<=90s" }
]

Global minimum threshold for leak detection. Segments must be open at least this long to be considered potential leaks. Rules in monitoring.log.leak-detection.rules can add stricter thresholds for specific segment types.

Priority: SCRIBE_LOG_LEAK_DETECTION_THRESHOLD > config

Property | Value
Default | 90s
Override | SCRIBE_LOG_LEAK_DETECTION_THRESHOLD (optional)
monitoring.log.leak-detection.threshold = 90s
monitoring.log.leak-detection.threshold = ${?SCRIBE_LOG_LEAK_DETECTION_THRESHOLD}
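A sketch of always-on leak detection with a longer threshold. Two points are assumptions: that the last default rule should be restated to match the new threshold, and that the mode value is best quoted. Since assigning a list in standard HOCON replaces the previous value rather than merging with it, any default exclusions that are still wanted have to be repeated.

```hocon
monitoring.log.leak-detection {
  # Run in every app.mode, not only in dev
  mode = "on"
  threshold = 5m
  rules = [
    # Default exclusion repeated because this list replaces the default one
    { action = exclude, name = "HTTP * /observe/mcp" }
    # Example from above: known long-running background jobs
    { action = exclude, name = "Background.*" }
    # Assumed to mirror the default rule, adjusted to the 5m threshold
    { action = exclude, where = "leak.duration.seconds<=5m" }
  ]
}
```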

Redaction


Emission Rules

Unified rules for controlling which operations are logged. Each operation passes through the following decision flow:

  • Failures and warnings are always logged regardless of rules (client errors remain subject to rules while monitoring.log.filter-client-errors is true).
  • Noteworthy operations (e.g. configuration reloads, license changes) are always logged.
  • Rules are evaluated in order and the first matching rule wins: include logs the operation, exclude suppresses it.
  • Sample rate applies to everything else (operations that no rule matched).

Rule syntax:

{ action = include|exclude, name = "glob", where = "filter", id = "identifier" }

Fields:

  • action (required): include (log the operation) or exclude (suppress it)
  • name (optional): glob pattern for operation name (e.g., "LDAP.*", "*.ShutDown")
  • where (optional): filter on attributes. Supports FleX, LDAP, SCIM, and JSON:
      FleX: "scribe.result=ok" (preferred - cleaner syntax)
      LDAP: "(scribe.result=ok)"
  • id (optional): rule identifier for telemetry/logging (defaults to “Rule #N”)

Operation names follow the format {Channel}.{Operation}:

Channel | Operations
LDAP | Search, Bind, Compare, Modify, Add, Delete
REST | Search, Modify, Add, Delete
GraphQL | Search, Modify, Add, Delete
GRPC | Search, Modify, Add, Delete

Duration attributes (ending in .seconds) support multiple formats:

  • Plain seconds: 0.1, 90, 5.5
  • HOCON style: 100ms, 5s, 1m, 2h
  • ISO 8601: PT90S, PT1M30S, PT1H

Common examples (FleX is the runtime query language; also works in config):

  • Log all LDAP operations: { action = include, name = "LDAP.*" }
  • Suppress fast successful ops: { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
  • Log slow operations: { action = include, where = "duration.seconds >= 5s" }
  • Log errors or slow: { action = include, where = "scribe.result != ok or duration.seconds >= 5s" }
  • Using natural language: { action = exclude, where = "scribe.result is ok and duration.seconds is at most 50ms" }

Property | Value
Default | see the rule list below

monitoring.log.rules = [
  ## Suppress successful LDAP operations under 500ms (higher threshold for DB-backed searches)
  { action = exclude, name = "LDAP.*", where = "scribe.result=ok duration.seconds<=500ms" }
  ## Suppress fast successful operations (primary use case for noise reduction)
  { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
  ## Suppress fast client errors (4xx) - expected from chaos/load testing
  ## These are client mistakes, not server issues; only log if slow (potential perf issue)
  { action = exclude, where = "failure.kind=INVALID_ARGUMENT duration.seconds<=100ms" }
  { action = exclude, where = "failure.kind=NOT_FOUND duration.seconds<=100ms" }
  ## Default noise gates for internal operations
  { action = exclude, name = "Transcription.WorkItem", where = "duration.seconds<=500ms" }
  { action = exclude, name = "Hints.ExplainSampling", where = "duration.seconds<=10s" }
  { action = exclude, name = "Hints.*", where = "duration.seconds<=100ms" }
  { action = exclude, name = "Metrics.*", where = "duration.seconds<=500ms" }
  { action = exclude, name = "*.ShutDown", where = "duration.seconds<=5s" }
  { action = exclude, name = "*.StartUp", where = "duration.seconds<=5s" }
  { action = exclude, name = "Ingest.AsyncSearch.*", where = "duration.seconds<=5s" }
]

Random sampling for operations that don’t match any rule. 0 = never log, 100 = always log (default)

Property | Value
Default | 100
Override | SCRIBE_LOG_SAMPLE_RATE (optional)
monitoring.log.sample-rate = 100
monitoring.log.sample-rate = ${?SCRIBE_LOG_SAMPLE_RATE}
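To show how rules and sampling interact, here is a small sketch using only the rule forms documented above. It assumes sample-rate is a percentage (the 0 to 100 range suggests this) and uses made-up rule ids. In standard HOCON, assigning a list replaces the earlier value, so the default rules would no longer apply unless they are repeated.

```hocon
monitoring.log {
  rules = [
    # Evaluated first: slow operations are always logged
    { action = include, where = "duration.seconds >= 5s", id = "slow-ops" }
    # Evaluated second: successful LDAP operations are suppressed
    { action = exclude, name = "LDAP.*", where = "scribe.result=ok", id = "quiet-ldap" }
  ]
  # Operations that matched no rule are sampled (assumed: about 10 in 100)
  sample-rate = 10
}
```

With this configuration, a successful LDAP search taking 6 seconds is logged because the first rule matches before the second is considered.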

Shutdown Noise Suppression

Controls whether expected shutdown-time errors (connection refused, interrupts, cancellation, pool closed) are downgraded from ERROR to DEBUG.

Priority: SCRIBE_LOG_SHUTDOWN_NOISE > config

Mode: off | on | auto

  • off - Never suppress; always log as ERROR (useful for debugging shutdown issues)
  • on - Suppress known shutdown noise during shutdown phases
  • auto - Suppress in test and production modes; show in development (default)

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local) → off (show errors for debugging)
  • Test mode (app.mode=test) → on (quiet test output)
  • Production mode → on (suppress expected teardown noise)

Property | Value
Default | auto
Override | SCRIBE_LOG_SHUTDOWN_NOISE (optional)
monitoring.log.shutdown-noise.mode = auto
monitoring.log.shutdown-noise.mode = ${?SCRIBE_LOG_SHUTDOWN_NOISE}

MCP (Model Context Protocol) observe channel

Exposes operational insights to AI coding assistants via:

  • Tools: status, health, doctor, signals, channels, stats
  • Prompts: observe guide, troubleshooting-ops
  • Resources: observe OpenAPI spec, resolved config

The endpoint is fixed at /observe/mcp.

IMPORTANT: MCP requires HTTP/1.1 for Server-Sent Events (SSE) transport. HTTP/2-only sockets will not work. Ensure the bound socket has HTTP/1.1 enabled:

http.protocols.http-1-1.enabled = true (default)

Authentication for /observe/mcp endpoint. Inherits from monitoring.auth (which inherits from auth).

Property | Value
Override | SCRIBE_MONITORING_MCP_AUTH_ENABLED (optional)
monitoring.mcp.auth.enabled = ${?SCRIBE_MONITORING_MCP_AUTH_ENABLED}

Property | Value
Override | SCRIBE_MONITORING_MCP_AUTH_METHODS (optional)
monitoring.mcp.auth.methods = ${?SCRIBE_MONITORING_MCP_AUTH_METHODS}

Enable the MCP observe channel

Priority: SCRIBE_MONITORING_MCP_ENABLED > config

Property | Value
Default | false
Override | SCRIBE_MONITORING_MCP_ENABLED (optional)
monitoring.mcp.enabled = false
monitoring.mcp.enabled = ${?SCRIBE_MONITORING_MCP_ENABLED}

Socket reference(s) for /observe/mcp endpoint. Can be a single name, comma-separated list, or HOCON list. Falls back to monitoring.socket if unset.

Priority: SCRIBE_MONITORING_MCP_SOCKET > config

Examples:

socket = "admin" # Single socket
socket = "admin, public" # Comma-separated
socket = ["admin", "public"] # HOCON list

Property | Value
Override | SCRIBE_MONITORING_MCP_SOCKET (optional)
monitoring.mcp.socket = ${?SCRIBE_MONITORING_MCP_SOCKET}
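Putting the MCP settings together, a minimal opt-in sketch might look like the following. The socket name "admin" is taken from the examples above and is assumed to be defined elsewhere; the bearer method mirrors the monitoring.auth example at the top of this page.

```hocon
monitoring.mcp {
  # The MCP observe channel is disabled by default
  enabled = true
  # Assumed socket name; monitoring.socket is used if this is unset
  socket = "admin"
  auth {
    enabled = true
    methods = [bearer]
  }
}

# SSE transport needs HTTP/1.1 on the bound socket (already the default)
http.protocols.http-1-1.enabled = true
```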

Metrics configuration (OTLP push, independent of Prometheus scrape)

Enable OTLP metrics push export

Default: false (Prometheus at /metrics is the primary export)

Priority: SCRIBE_METRICS_ENABLED > config

Property | Value
Default | false
Override | SCRIBE_METRICS_ENABLED (optional)
monitoring.metrics.enabled = false
monitoring.metrics.enabled = ${?SCRIBE_METRICS_ENABLED}

OTLP endpoint for metrics

Priority: SCRIBE_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > config

Property | Value
Default | "http://localhost:4317"
Override | SCRIBE_METRICS_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard)
monitoring.metrics.endpoint = "http://localhost:4317"
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_METRICS_ENDPOINT}
monitoring.metrics.endpoint = ${?SCRIBE_METRICS_ENDPOINT}

Export interval for OTLP push

Priority: SCRIBE_METRICS_INTERVAL > config

Property | Value
Default | 60 seconds
Override | SCRIBE_METRICS_INTERVAL (optional)
monitoring.metrics.interval = 60 seconds
monitoring.metrics.interval = ${?SCRIBE_METRICS_INTERVAL}

Protocol: "grpc" (port 4317) or "http/protobuf" (port 4318)

  • When protocol is "http/protobuf" and the endpoint still uses the default gRPC port (4317), Identity Scribe will automatically rewrite it to port 4318.
  • For "http/protobuf", if the endpoint does not already include "/v1/metrics", it will be appended.

Priority: SCRIBE_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config

Property | Value
Default | "grpc"
Override | SCRIBE_METRICS_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard)
monitoring.metrics.protocol = "grpc"
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_METRICS_PROTOCOL}
monitoring.metrics.protocol = ${?SCRIBE_METRICS_PROTOCOL}
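A sketch of OTLP push over HTTP, using the metrics keys documented in this section. The collector host name is hypothetical, and the interval is an arbitrary example.

```hocon
monitoring.metrics {
  # OTLP push is off by default; Prometheus scrape remains available independently
  enabled = true
  protocol = "http/protobuf"
  # Hypothetical collector address; per the notes above, "/v1/metrics" is appended if missing
  endpoint = "http://otel-collector:4318"
  interval = 30 seconds
}
```

Any of the SCRIBE_METRICS_* or OTEL_EXPORTER_OTLP_* variables listed above override these values when set.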

Observe endpoints (/observe, /observe/*). Provides status, doctor, pressure, indexes, hints, signatures, stats, etc.

Authentication for /observe/* endpoints. Inherits from monitoring.auth (which inherits from auth). Recommended: require auth in production (exposes sensitive operational data).

Property | Value
Override | SCRIBE_OBSERVE_AUTH_ENABLED (optional)
monitoring.observe.auth.enabled = ${?SCRIBE_OBSERVE_AUTH_ENABLED}

Property | Value
Override | SCRIBE_OBSERVE_AUTH_METHODS (optional)
monitoring.observe.auth.methods = ${?SCRIBE_OBSERVE_AUTH_METHODS}

Enable /observe/* endpoints.

Priority: SCRIBE_OBSERVE_ENABLED > config

Property | Value
Default | true
Override | SCRIBE_OBSERVE_ENABLED (optional)
monitoring.observe.enabled = true
monitoring.observe.enabled = ${?SCRIBE_OBSERVE_ENABLED}

Socket for /observe/* endpoints. Falls back to monitoring.socket if unset.

Priority: SCRIBE_OBSERVE_SOCKET > config

Property | Value
Override | SCRIBE_OBSERVE_SOCKET (optional)
monitoring.observe.socket = ${?SCRIBE_OBSERVE_SOCKET}
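Following the recommendation to require authentication in production, the observe settings could be combined as sketched below. The socket name is hypothetical, and the bearer method mirrors the monitoring.auth example at the top of this page.

```hocon
monitoring.observe {
  enabled = true
  # Hypothetical socket name; monitoring.socket is used if this is unset
  socket = "admin"
  auth {
    # Overrides whatever monitoring.auth would otherwise provide
    enabled = true
    methods = [bearer]
  }
}
```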

OpenAPI UI (Scalar API Reference) for /observe endpoints. Inherits all defaults from openapi.ui. Channel-specific env vars can override individual settings.

Property | Value
Default | ${openapi.ui}

Inherits from: openapi.ui

monitoring.observe.ui = ${openapi.ui}

External asset (CDN override)

Property | Value
Override | SCRIBE_OBSERVE_UI_ASSET_SRI (optional)
monitoring.observe.ui.asset.sri = ${?SCRIBE_OBSERVE_UI_ASSET_SRI}

Property | Value
Override | SCRIBE_OBSERVE_UI_ASSET_URL (optional)
monitoring.observe.ui.asset.url = ${?SCRIBE_OBSERVE_UI_ASSET_URL}

Property | Value
Override | SCRIBE_OBSERVE_UI_DARK_MODE (optional)
monitoring.observe.ui.dark-mode = ${?SCRIBE_OBSERVE_UI_DARK_MODE}

Channel-specific env var overrides

Priority: SCRIBE_OBSERVE_UI_* > openapi.ui.*

Property | Value
Override | SCRIBE_OBSERVE_UI_ENABLED (optional)
monitoring.observe.ui.enabled = ${?SCRIBE_OBSERVE_UI_ENABLED}

monitoring.observe.ui.force-dark-mode-state

Property | Value
Override | SCRIBE_OBSERVE_UI_FORCE_DARK_MODE_STATE (optional)
monitoring.observe.ui.force-dark-mode-state = ${?SCRIBE_OBSERVE_UI_FORCE_DARK_MODE_STATE}

monitoring.observe.ui.hide-dark-mode-toggle

Property | Value
Override | SCRIBE_OBSERVE_UI_HIDE_DARK_MODE_TOGGLE (optional)
monitoring.observe.ui.hide-dark-mode-toggle = ${?SCRIBE_OBSERVE_UI_HIDE_DARK_MODE_TOGGLE}

monitoring.observe.ui.hide-test-request-button

Property | Value
Override | SCRIBE_OBSERVE_UI_HIDE_TEST_REQUEST_BUTTON (optional)
monitoring.observe.ui.hide-test-request-button = ${?SCRIBE_OBSERVE_UI_HIDE_TEST_REQUEST_BUTTON}

Property | Value
Override | SCRIBE_OBSERVE_UI_LAYOUT (optional)
monitoring.observe.ui.layout = ${?SCRIBE_OBSERVE_UI_LAYOUT}

Property | Value
Override | SCRIBE_OBSERVE_UI_SEARCH_HOT_KEY (optional)
monitoring.observe.ui.search-hot-key = ${?SCRIBE_OBSERVE_UI_SEARCH_HOT_KEY}

monitoring.observe.ui.show-developer-tools

Property | Value
Override | SCRIBE_OBSERVE_UI_SHOW_DEVELOPER_TOOLS (optional)
monitoring.observe.ui.show-developer-tools = ${?SCRIBE_OBSERVE_UI_SHOW_DEVELOPER_TOOLS}

Property | Value
Override | SCRIBE_OBSERVE_UI_SHOW_SIDEBAR (optional)
monitoring.observe.ui.show-sidebar = ${?SCRIBE_OBSERVE_UI_SHOW_SIDEBAR}

Property | Value
Override | SCRIBE_OBSERVE_UI_THEME (optional)
monitoring.observe.ui.theme = ${?SCRIBE_OBSERVE_UI_THEME}

Prometheus scrape endpoint (/metrics)

Authentication for /metrics endpoint.

Default: allow anonymous access (Prometheus typically scrapes without auth).

Prometheus can authenticate if configured with authorization.credentials or basic_auth in the scrape config. Override to require auth:

monitoring.prometheus.auth.rules = [
  { action = deny, where = "subject.anonymous = true" }
  { action = allow }
]

Property | Value
Override | SCRIBE_PROMETHEUS_AUTH_METHODS (optional)
monitoring.prometheus.auth.methods = ${?SCRIBE_PROMETHEUS_AUTH_METHODS}

Allow anonymous access for Prometheus scraping.

Property | Value
Default | [{ id = "prometheus-public", action = allow }]
monitoring.prometheus.auth.rules = [{ id = "prometheus-public", action = allow }]

Enable Prometheus scrape endpoint at /metrics

Priority: SCRIBE_PROMETHEUS_ENABLED > config

Property  Value
Default   true
Override  SCRIBE_PROMETHEUS_ENABLED (optional)
monitoring.prometheus.enabled = true
monitoring.prometheus.enabled = ${?SCRIBE_PROMETHEUS_ENABLED}

The interval at which metrics are scraped internally. Background scraping decouples collection latency from HTTP response time. Prometheus itself scrapes every 60 seconds by default; the default here is 30 seconds, half of that.

Priority: SCRIBE_PROMETHEUS_SCRAPE_INTERVAL > config

Property  Value
Default   30 seconds
Override  SCRIBE_PROMETHEUS_SCRAPE_INTERVAL (optional)
monitoring.prometheus.scrape-interval = 30 seconds
monitoring.prometheus.scrape-interval = ${?SCRIBE_PROMETHEUS_SCRAPE_INTERVAL}
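A sketch of tuning the internal interval, assuming a Prometheus server that scrapes this instance every 30 seconds (an assumed value, not something this page prescribes). It applies the same half-of-the-scrape-interval rule of thumb that the default follows:

```hocon
# Assumption: Prometheus scrape_interval for this target is 30s.
# Refresh the internal snapshot at half that, mirroring the default's 1/2 ratio.
monitoring.prometheus {
  enabled = true
  scrape-interval = 15 seconds
}
```

If SCRIBE_PROMETHEUS_SCRAPE_INTERVAL is set, it takes priority over the value in the config file.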

Socket for /metrics endpoint. Falls back to monitoring.socket if unset.

Priority: SCRIBE_PROMETHEUS_SOCKET > config

Property  Value
Override  SCRIBE_PROMETHEUS_SOCKET (optional)
monitoring.prometheus.socket = ${?SCRIBE_PROMETHEUS_SOCKET}

OpenTelemetry (OTel) configuration

Environment variable overrides (highest priority wins):

  • Priority: SCRIBE_* > OTEL_* > config file > default
  • HOCON substitution: later values override earlier ones

Enablement semantics:

  • OTel SDK is automatically enabled when any metric export is active:
monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true
  • OTEL_SDK_DISABLED=true disables OTLP exporters but allows Prometheus pull
  • To disable tracing: set monitoring.traces.enabled=false
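As an illustration of the enablement rules above, the following sketch keeps the Prometheus pull endpoint active (which, per those rules, also activates the OTel SDK) while leaving trace export off. It uses only keys documented on this page:

```hocon
# Metrics via Prometheus pull only; no trace export.
monitoring {
  prometheus.enabled = true   # any active metric export enables the OTel SDK
  traces.enabled = false      # tracing stays off (this is also the default)
}
```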

Resource attributes for all telemetry

Deployment environment (e.g., “production”, “staging”, “dev”)

Priority: SCRIBE_RESOURCE_ENVIRONMENT > OTEL_DEPLOYMENT_ENVIRONMENT > config

Property  Value
Default   null
Override  SCRIBE_RESOURCE_ENVIRONMENT (optional) > OTEL_DEPLOYMENT_ENVIRONMENT (standard)
monitoring.resource.environment = ${?OTEL_DEPLOYMENT_ENVIRONMENT}
monitoring.resource.environment = ${?SCRIBE_RESOURCE_ENVIRONMENT}

Service name

Priority: SCRIBE_RESOURCE_SERVICE_NAME > OTEL_SERVICE_NAME > config

Property  Value
Default   "identity-scribe"
Override  SCRIBE_RESOURCE_SERVICE_NAME (optional) > OTEL_SERVICE_NAME (standard)
monitoring.resource.service-name = "identity-scribe"
monitoring.resource.service-name = ${?OTEL_SERVICE_NAME}
monitoring.resource.service-name = ${?SCRIBE_RESOURCE_SERVICE_NAME}

Service version (auto-detected from Implementation-Version if unset)

Priority: SCRIBE_RESOURCE_SERVICE_VERSION > OTEL_SERVICE_VERSION > config

Property  Value
Default   null
Override  SCRIBE_RESOURCE_SERVICE_VERSION (optional) > OTEL_SERVICE_VERSION (standard)
monitoring.resource.service-version = ${?OTEL_SERVICE_VERSION}
monitoring.resource.service-version = ${?SCRIBE_RESOURCE_SERVICE_VERSION}
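The three resource attributes can also be set together in one block. A minimal sketch; the environment and version values below are placeholders chosen for illustration, not defaults:

```hocon
monitoring.resource {
  environment = "staging"           # placeholder; the default is null
  service-name = "identity-scribe"  # the documented default
  service-version = "0.0.0"         # placeholder; auto-detected from Implementation-Version if unset
}
```

Per the priority order stated for each key, the SCRIBE_* and OTEL_* environment variables still take precedence over values written in the config file.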

Golden signals thresholds for health status evaluation. Used by /observe/doctor and /observe/signals to determine degraded status. Thresholds are evaluated with volume gates (min-requests) to prevent flapping at low traffic. All rates are percentages.

Per-channel overrides: Channels can inherit and override these thresholds using HOCON inheritance. Example:

channels.ldap {
  signals = ${monitoring.signals} {
    latency-p99 { degraded = 0.3, critical = 1.0 }
  }
}

Client error rate threshold (info-only, triggers NOISY not DEGRADED). High client errors don’t indicate server problems, but may need attention.

monitoring.signals.client-error-rate.noisy

Property  Value
Default   10.0 # >= 10% client errors → NOISY
monitoring.signals.client-error-rate.noisy = 10.0 # >= 10% client errors → NOISY

Latency p99 thresholds in seconds. Used for absolute latency threshold checks.

Property  Value
Default   2.0 # >= 2s → CRITICAL
monitoring.signals.latency-p99.critical = 2.0 # >= 2s → CRITICAL

Property  Value
Default   0.5 # >= 500ms → DEGRADED
monitoring.signals.latency-p99.degraded = 0.5 # >= 500ms → DEGRADED

Minimum requests in 5m window to evaluate thresholds. Below this, status remains HEALTHY regardless of rates.

Priority: SCRIBE_SIGNALS_MIN_REQUESTS > config

Property  Value
Default   200
Override  SCRIBE_SIGNALS_MIN_REQUESTS (optional)
monitoring.signals.min-requests = 200
monitoring.signals.min-requests = ${?SCRIBE_SIGNALS_MIN_REQUESTS}
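To make the volume gate concrete, here is a sketch for a hypothetical low-traffic deployment that lowers the gate to 50 requests. The request counts in the comments are invented for illustration; the thresholds are the documented server-error-rate defaults (0.5% degraded, 2.0% critical):

```hocon
monitoring.signals {
  # Hypothetical value for a low-traffic instance; the default is 200.
  min-requests = 50
}

# Worked illustration over a 5m window:
#    40 requests, 2 server errors -> 5.0%, but 40 < 50 (below the gate) -> HEALTHY
#   400 requests, 3 server errors -> 0.75% >= 0.5%                     -> DEGRADED
#   400 requests, 8 server errors -> 2.0%  >= 2.0%                     -> CRITICAL
```

A lower gate makes status more responsive at low traffic but also more prone to the flapping that the gate exists to prevent.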

Saturation thresholds (global only, not per-channel).


Server error rate thresholds (primary degraded signal). These are percentages of total requests.

monitoring.signals.server-error-rate.critical

Property  Value
Default   2.0 # >= 2.0% → CRITICAL
monitoring.signals.server-error-rate.critical = 2.0 # >= 2.0% → CRITICAL

monitoring.signals.server-error-rate.degraded

Property  Value
Default   0.5 # >= 0.5% → DEGRADED
monitoring.signals.server-error-rate.degraded = 0.5 # >= 0.5% → DEGRADED

Timeout rate thresholds.

Property  Value
Default   1.0 # >= 1.0% → CRITICAL
monitoring.signals.timeout-rate.critical = 1.0 # >= 1.0% → CRITICAL

Property  Value
Default   0.2 # >= 0.2% → DEGRADED
monitoring.signals.timeout-rate.degraded = 0.2 # >= 0.2% → DEGRADED

Socket reference for monitoring endpoints. Use a named socket from http.sockets.*. If unset (no env var, key omitted, or value is null), defaults to @default.

Priority: SCRIBE_MONITORING_SOCKET > config

Property  Value
Override  SCRIBE_MONITORING_SOCKET (optional)
monitoring.socket = ${?SCRIBE_MONITORING_SOCKET}

Trace exporter configuration

Enable trace export

Default: false (opt-in for tracing)

Priority: SCRIBE_TRACES_ENABLED > config

Property  Value
Default   false
Override  SCRIBE_TRACES_ENABLED (optional)
monitoring.traces.enabled = false
monitoring.traces.enabled = ${?SCRIBE_TRACES_ENABLED}

OTLP endpoint for traces

Priority: SCRIBE_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > MONITORING_TRACING_OTLP_URL > config

Property  Value
Default   "http://localhost:4317"
Override  SCRIBE_TRACES_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard) > MONITORING_TRACING_OTLP_URL (standard)
monitoring.traces.endpoint = "http://localhost:4317"
monitoring.traces.endpoint = ${?MONITORING_TRACING_OTLP_URL}
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_TRACES_ENDPOINT}
monitoring.traces.endpoint = ${?SCRIBE_TRACES_ENDPOINT}

Headers for OTLP trace export (e.g. auth)

Comma-separated key=value pairs. Examples:

Bearer: authorization=Bearer <token>

Basic: authorization=Basic <base64(user:pass)>

Priority: SCRIBE_TRACES_HEADERS > MONITORING_TRACING_HEADERS > OTEL_EXPORTER_OTLP_TRACES_HEADERS > OTEL_EXPORTER_OTLP_HEADERS

Property  Value
Override  SCRIBE_TRACES_HEADERS (optional) > MONITORING_TRACING_HEADERS (standard) > OTEL_EXPORTER_OTLP_TRACES_HEADERS (standard) > OTEL_EXPORTER_OTLP_HEADERS (standard)
monitoring.traces.headers = ${?OTEL_EXPORTER_OTLP_HEADERS}
monitoring.traces.headers = ${?OTEL_EXPORTER_OTLP_TRACES_HEADERS}
monitoring.traces.headers = ${?MONITORING_TRACING_HEADERS}
monitoring.traces.headers = ${?SCRIBE_TRACES_HEADERS}

Protocol: “grpc” (port 4317) or “http/protobuf” (port 4318)

  • When protocol is “http/protobuf” and the endpoint still uses the default gRPC port (4317), Identity Scribe automatically rewrites it to port 4318.
  • For “http/protobuf”, if the endpoint does not already include “/v1/traces”, it is appended.

Priority: SCRIBE_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config

Property  Value
Default   "grpc"
Override  SCRIBE_TRACES_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard)
monitoring.traces.protocol = "grpc"
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_TRACES_PROTOCOL}
monitoring.traces.protocol = ${?SCRIBE_TRACES_PROTOCOL}

Export timeout

Priority: SCRIBE_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TIMEOUT > config

Property  Value
Default   10 seconds
Override  SCRIBE_TRACES_TIMEOUT (optional) > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT (standard) > OTEL_EXPORTER_OTLP_TIMEOUT (standard)
monitoring.traces.timeout = 10 seconds
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TIMEOUT}
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TRACES_TIMEOUT}
monitoring.traces.timeout = ${?SCRIBE_TRACES_TIMEOUT}
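
Putting the trace exporter settings together, a minimal sketch of an opt-in tracing setup over OTLP/HTTP might look like the following. The collector hostname and the 5-second timeout are hypothetical, and <token> is left as a placeholder to fill in:

```hocon
monitoring.traces {
  enabled = true                 # tracing is opt-in; the default is false
  protocol = "http/protobuf"     # the default is "grpc"
  # Hypothetical collector host. Per the protocol notes, "/v1/traces" is
  # appended for http/protobuf when the endpoint does not already include it.
  endpoint = "http://otel-collector.example.internal:4318"
  headers = "authorization=Bearer <token>"   # comma-separated key=value pairs
  timeout = 5 seconds            # hypothetical; the default is 10 seconds
}
```

Any of these values can still be overridden at deploy time through the SCRIBE_TRACES_* or OTEL_EXPORTER_OTLP_* variables listed for each key.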