Monitoring
Monitoring endpoints and telemetry configuration. Includes Prometheus metrics, health probes, and OpenTelemetry tracing.
monitoring.enabled
Enable the monitoring server.
| Property | Value |
|---|---|
| Default | true |

monitoring.enabled = true

monitoring.exemplars
Exemplars (metrics → traces) for histogram metrics. Exemplars attach trace context (traceId, spanId) to histogram buckets, enabling “click to trace” in Grafana when using Prometheus + Tempo.
Values:
- off - No exemplars recorded (default, minimal overhead)
- traceBased - Record exemplars only when an active sampled span exists
- alwaysOn - Record all measurements as exemplar candidates
IMPORTANT:
- traceBased requires tracing to be enabled (monitoring.traces.enabled=true)
- Exemplars require a metrics exporter to be enabled: monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true
External requirements (operator responsibility):
- Prometheus: --enable-feature=exemplar-storage
- Grafana: Configure Tempo datasource with exemplar linking
Priority: SCRIBE_METRICS_EXEMPLARS > OTEL_METRICS_EXEMPLAR_FILTER > config
| Property | Value |
|---|---|
| Default | "off" |
| Override | SCRIBE_METRICS_EXEMPLARS (optional) > OTEL_METRICS_EXEMPLAR_FILTER (standard) |
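
The requirements above can be combined into one fragment. This is a minimal sketch that assumes the keys nest under a single monitoring block in the usual HOCON way; it only uses keys named in this section.

```hocon
monitoring {
  # Record exemplars only when an active sampled span exists
  exemplars = "traceBased"

  # traceBased requires tracing to be enabled
  traces.enabled = true

  # At least one metrics exporter must be enabled for exemplars to be exported
  prometheus.enabled = true
}
```

On the operator side, Prometheus still needs to run with --enable-feature=exemplar-storage, and Grafana needs a Tempo datasource with exemplar linking.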

monitoring.exemplars = "off"
monitoring.exemplars = ${?OTEL_METRICS_EXEMPLAR_FILTER}
monitoring.exemplars = ${?SCRIBE_METRICS_EXEMPLARS}

monitoring.health
Health probe endpoints (/livez, /readyz, /startedz, /healthz). Kubernetes-style health probes for liveness, readiness, and startup checks.
monitoring.health.enabled
Enable health probe endpoints.
Priority: SCRIBE_HEALTH_ENABLED > config
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_HEALTH_ENABLED (optional) |

monitoring.health.enabled = true
monitoring.health.enabled = ${?SCRIBE_HEALTH_ENABLED}

monitoring.health.socket
Socket for health probe endpoints. Falls back to monitoring.socket if unset.
Priority: SCRIBE_HEALTH_SOCKET > config
| Property | Value |
|---|---|
| Override | SCRIBE_HEALTH_SOCKET (optional) |
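
As an illustration, the fragment below serves the health probes on a dedicated socket. The socket name "admin" is hypothetical, and the sketch assumes it refers to a socket defined under http.sockets.*, as described for monitoring.socket.

```hocon
monitoring.health {
  enabled = true

  # "admin" is a hypothetical socket name, assumed to be defined under http.sockets.*
  socket = "admin"
}
```

Leaving socket unset keeps the probes on whatever monitoring.socket resolves to.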

monitoring.health.socket = ${?SCRIBE_HEALTH_SOCKET}

monitoring.hints
Configuration for the operational hint engine. The hint engine detects performance issues and optimization opportunities during LDAP search execution, surfacing actionable recommendations via metrics and structured logs.
Detector modules:
- Equality/Range Coverage: Detects missing valueMatch indexes
- Sort/VLV Readiness: Detects missing sortable indexes and non-deterministic ordering
- Control Fallback: Detects when LDAP controls force delegation or rewrites
- Type Scope Guidance: Recommends explicit entryType constraints for better performance
- Startup Validation: Validates indices configuration at application startup
monitoring.hints.cache
Query signature cache configuration. Controls TTL and size for the query signature cache used by EXPLAIN sampling. This cache stores execution counts, last EXPLAIN timestamps, sequential-scan flags, and SQL text needed for future analysis.
monitoring.hints.cache.size
Maximum number of cached query signatures. When the cache is full, least-recently-used entries are evicted.
Priority: SCRIBE_HINTS_CACHE_SIZE > config
Default: 500
| Property | Value |
|---|---|
| Default | 500 |
| Override | SCRIBE_HINTS_CACHE_SIZE (optional) |

monitoring.hints.cache.size = 500
monitoring.hints.cache.size = ${?SCRIBE_HINTS_CACHE_SIZE}

monitoring.hints.cache.ttl
Time-to-live for signature cache entries. Signatures older than this TTL will be re-EXPLAINed on next execution. The cache uses expire-after-access, so hot signatures stay longer.
Priority: SCRIBE_HINTS_CACHE_TTL > config
Default: 1 hour
| Property | Value |
|---|---|
| Default | 1 hour |
| Override | SCRIBE_HINTS_CACHE_TTL (optional) |
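
A sketch of tuning both cache settings together, for a deployment that sees many distinct query shapes. The values are illustrative, not recommendations.

```hocon
monitoring.hints.cache {
  # Keep up to 2000 query signatures before least-recently-used eviction starts
  size = 2000

  # Entries not accessed for 30 minutes expire and are re-EXPLAINed on next execution
  ttl = 30 minutes
}
```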

monitoring.hints.cache.ttl = 1 hour
monitoring.hints.cache.ttl = ${?SCRIBE_HINTS_CACHE_TTL}

monitoring.hints.detectors
Per-detector module toggles.
Each detector can be individually enabled or disabled. Defaults to enabled when monitoring.hints.enabled is true.
monitoring.hints.detectors.config-validation
Enable/disable startup configuration validation. Validates that all indexed attributes exist in observed attributes and checks virtual attributes for entryType in filters. Emits CONFIG hints for misconfigurations at startup.
| Property | Value |
|---|---|
| Default | true |

monitoring.hints.detectors.config-validation = true

monitoring.hints.detectors.control-fallback
Enable/disable control fallback detection. Emits hints when LDAP controls (e.g., VLV assertion values, unsupported sort attributes) force delegation or in-memory rewrites.
| Property | Value |
|---|---|
| Default | true |

monitoring.hints.detectors.control-fallback = true

monitoring.hints.detectors.equality-index
Enable/disable equality and range index coverage detection. Emits hints when equality (=), greaterOrEqual (>=), or lessOrEqual (<=) filters are used on attributes missing value-match (truncation) indexes.
| Property | Value |
|---|---|
| Default | true |

monitoring.hints.detectors.equality-index = true

monitoring.hints.detectors.sort-plan
Enable/disable sort and VLV readiness detection. Emits hints when sortable indexes are missing or sort order is non-deterministic (multi-valued attributes without tie-breaker).
| Property | Value |
|---|---|
| Default | true |

monitoring.hints.detectors.sort-plan = true

monitoring.hints.detectors.type-scope-guidance
Enable/disable type scope guidance. Emits informational hints recommending explicit (entryType=X) constraints when types are known from DN context but filters lack entryType.
| Property | Value |
|---|---|
| Default | true |

monitoring.hints.detectors.type-scope-guidance = true

monitoring.hints.enabled
Enable or disable hint collection and emission. When disabled, no hints are collected and no metrics are incremented.
Priority: SCRIBE_HINTS_ENABLED > config
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_HINTS_ENABLED (optional) |

monitoring.hints.enabled = true
monitoring.hints.enabled = ${?SCRIBE_HINTS_ENABLED}

monitoring.hints.explain
EXPLAIN-based runtime analysis and query signature tracking. Enables signature-based EXPLAIN sampling to detect sequential scans and correlate query patterns with performance issues.
monitoring.hints.explain.enabled
Enable or disable EXPLAIN sampling and query signature tracking. When disabled, no signatures are computed, no EXPLAIN plans are analyzed, and the /observe/signatures endpoint is unavailable. Defaults to enabled when monitoring.hints.enabled is true.
Priority: SCRIBE_HINTS_EXPLAIN_ENABLED > config
| Property | Value |
|---|---|
| Override | SCRIBE_HINTS_EXPLAIN_ENABLED (optional) |

monitoring.hints.explain.enabled = ${?SCRIBE_HINTS_EXPLAIN_ENABLED}

monitoring.hints.explain.options
EXPLAIN command options for enhanced plan analysis. Options control what information is included in EXPLAIN output:
- ANALYZE: Execute query and show actual runtime statistics (actual rows, actual time)
- COSTS: Include estimated costs (enabled by default in PostgreSQL)
- VERBOSE: Include additional details (output columns, schema-qualified names)
- BUFFERS: Include buffer usage statistics (requires ANALYZE)
Default: [] (empty, uses default PostgreSQL EXPLAIN behavior)
Example: ["ANALYZE", "BUFFERS"] for runtime stats and buffer usage
| Property | Value |
|---|---|
| Default | [] |
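
The example from the description can be written as a config fragment. This is a sketch only; it assumes the option names are passed as plain strings, as shown in the example above.

```hocon
monitoring.hints.explain {
  enabled = true

  # ANALYZE executes the sampled query to collect actual row counts and timings;
  # BUFFERS adds buffer usage statistics and requires ANALYZE
  options = ["ANALYZE", "BUFFERS"]
}
```

Because ANALYZE runs the query rather than only planning it, the empty default is the lower-overhead choice.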

monitoring.hints.explain.options = []

monitoring.hints.explain.sample-every
Re-EXPLAIN every N query executions for the same signature. Used to detect plan drift and refresh analysis for frequently-executed queries.
Priority: SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY > config
Default: 1000
| Property | Value |
|---|---|
| Default | 1000 |
| Override | SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY (optional) |

monitoring.hints.explain.sample-every = 1000
monitoring.hints.explain.sample-every = ${?SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY}

monitoring.hints.explain.slow-query-duration
Slow query threshold duration. Queries exceeding this duration trigger high-priority EXPLAIN sampling (non-blocking, background execution).
This value also controls fast-query hint filtering: sequential scan hints without actionable index recommendations are suppressed when execution time is below sqrt(slow-query-duration), with the square root taken over the value in milliseconds (~31ms at the default of 1s). This reduces noise from fast queries on small tables that don’t benefit from indexing.
| slow-query-duration | Fast hint threshold |
|---|---|
| 100ms | ~10ms |
| 500ms | ~22ms |
| 1000ms (default) | ~31ms |
| 5000ms | ~70ms |
Priority: SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION > config
Default: 1000ms
| Property | Value |
|---|---|
| Default | 1000ms |
| Override | SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION (optional) |
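
As a worked illustration of the relationship in the table above, the following sketch lowers the slow-query threshold to 500ms. The resulting fast-hint threshold quoted in the comment is taken from that table.

```hocon
# Queries slower than 500ms trigger high-priority EXPLAIN sampling.
# Per the table above, sequential-scan hints without an index recommendation
# are then suppressed for executions faster than roughly 22ms (sqrt of 500).
monitoring.hints.explain.slow-query-duration = 500ms
```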

monitoring.hints.explain.slow-query-duration = 1000ms
monitoring.hints.explain.slow-query-duration = ${?SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION}

monitoring.hints.persistence
Hint persistence configuration.
Persists hints to PostgreSQL for audit and Insights visualization. When enabled, both hint signals and query signatures are persisted. Query signatures store the full normalized query structure (filter, sort, scope, types, controls) for debugging and correlation with hints.
monitoring.hints.persistence.batch-interval
Maximum wait time between batch flushes. Batches are flushed when full or when this interval elapses.
Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL > config
Default: 5 seconds
| Property | Value |
|---|---|
| Default | 5 seconds |
| Override | SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL (optional) |

monitoring.hints.persistence.batch-interval = 5 seconds
monitoring.hints.persistence.batch-interval = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL}

monitoring.hints.persistence.batch-size
Batch size for database inserts. Hints and signatures are batched before writing to reduce database load.
Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE > config
Default: 100
| Property | Value |
|---|---|
| Default | 100 |
| Override | SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE (optional) |

monitoring.hints.persistence.batch-size = 100
monitoring.hints.persistence.batch-size = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE}

monitoring.hints.persistence.clear-on-startup
Clear persisted hints on startup. When true, truncates the hint_signals and hint_signatures tables at startup. This ensures hints reflect the current configuration rather than stale state from previous runs. Useful when configuration changes (e.g., new indexes) make old hints obsolete.
Priority: SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP > config
Default: true
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP (optional) |

monitoring.hints.persistence.clear-on-startup = true
monitoring.hints.persistence.clear-on-startup = ${?SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP}

monitoring.hints.persistence.enabled
Enable or disable hint persistence. When disabled, hints are only emitted via metrics and logs.
Priority: SCRIBE_HINTS_PERSISTENCE_ENABLED > config
Default: false (opt-in)
| Property | Value |
|---|---|
| Default | false |
| Override | SCRIBE_HINTS_PERSISTENCE_ENABLED (optional) |
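
Since persistence is opt-in, a fragment that turns it on and adjusts the related settings may be useful. This is a sketch with illustrative values; every key comes from the monitoring.hints.persistence sections of this page.

```hocon
monitoring.hints.persistence {
  enabled = true

  # Keep hints across restarts instead of truncating the tables at startup
  clear-on-startup = false

  # Retention: purge after 14 days, and cap the table at 100000 rows
  ttl = 14 days
  max-rows = 100000

  # Write in larger, less frequent batches to reduce database load
  batch-size = 200
  batch-interval = 10 seconds
}
```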

monitoring.hints.persistence.enabled = false
monitoring.hints.persistence.enabled = ${?SCRIBE_HINTS_PERSISTENCE_ENABLED}

monitoring.hints.persistence.max-queue
Maximum size of the in-memory persistence queue. When the queue is full, newest hints are dropped (FIFO eviction).
Priority: SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE > config
Default: 5000
| Property | Value |
|---|---|
| Default | 5000 |
| Override | SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE (optional) |

monitoring.hints.persistence.max-queue = 5000
monitoring.hints.persistence.max-queue = ${?SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE}

monitoring.hints.persistence.max-rows
Maximum number of hint rows to retain in the database (ring-buffer). When this limit is reached, oldest hints are evicted to make room for new ones.
Priority: SCRIBE_HINTS_PERSISTENCE_MAX_ROWS > config
Default: 50000
| Property | Value |
|---|---|
| Default | 50000 |
| Override | SCRIBE_HINTS_PERSISTENCE_MAX_ROWS (optional) |

monitoring.hints.persistence.max-rows = 50000
monitoring.hints.persistence.max-rows = ${?SCRIBE_HINTS_PERSISTENCE_MAX_ROWS}

monitoring.hints.persistence.rules
Rules for filtering which hints are persisted. Rules are evaluated in order; the first matching rule wins. Hints that match an “exclude” rule are still emitted via metrics and logs, but not stored in the database.
Rule syntax (same as log.rules): { action = include|exclude, where = "filter" }
Available attributes for filtering:
- hint-type: PARTIAL_MATCH_INDEX, EQUALITY_INDEX, SORT_PLAN, CONTROL_FALLBACK, EXPLAIN_SEQ_SCAN, TYPE_SCOPE_GUIDANCE, CONFIG
- severity: INFO, WARNING, ERROR
- attribute: the attribute name (e.g., “cn”, “mail”), null if not applicable
- entry-type: first entry type if multiple (e.g., “inetOrgPerson”), null if none
- base-dn: the base DN of the query (e.g., “ou=users,dc=example,dc=com”), null if none
Default: [] (empty - persist all hints)
| Property | Value |
|---|---|
| Default | [] |
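
To make the rule syntax concrete, here is a hedged sketch of a rule list. It assumes the where filter accepts the same attribute=value form that log.rules uses (for example "scribe.result=ok"); the specific filters are illustrative and have not been checked against a running instance.

```hocon
monitoring.hints.persistence.rules = [
  # Do not store informational type-scope guidance; it is still emitted via metrics and logs
  { action = exclude, where = "hint-type=TYPE_SCOPE_GUIDANCE" }

  # Do not store any remaining INFO-level hints
  { action = exclude, where = "severity=INFO" }
]
```

Hints that match no rule are persisted, which matches the behavior of the empty default list.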

monitoring.hints.persistence.rules = []

monitoring.hints.persistence.ttl
Time-to-live for persisted hints and signatures. Hints and signatures older than this TTL are automatically purged.
Priority: SCRIBE_HINTS_PERSISTENCE_TTL > config
Default: 7 days
| Property | Value |
|---|---|
| Default | 7 days |
| Override | SCRIBE_HINTS_PERSISTENCE_TTL (optional) |

monitoring.hints.persistence.ttl = 7 days
monitoring.hints.persistence.ttl = ${?SCRIBE_HINTS_PERSISTENCE_TTL}

monitoring.http
HTTP server settings for monitoring endpoints.
This section inherits all settings from http. Override individual settings as needed.
monitoring.log
Wide Log Operations
Accumulates context throughout request/task execution and emits a single structured log line when the operation is “interesting” (errors, warnings, slow). Wide events are emitted at INFO/WARN/ERROR level with the format:
scribe.log {"trace_id":"…","span_id":"…","duration":"PT0.123S",…}
Fields always included: duration (ISO-8601), result (ok|<failure_kind>)
Optional fields: trace_id, span_id, parent_span_id (when tracing enabled), failure (kind, code, details), warnings, events
monitoring.log.childs
Child Segment Tracking
Records child segments as events in wide logs with offset/duration timing. Useful for understanding operation breakdown and identifying slow sub-operations.
monitoring.log.childs.display
Display mode for segment events in PRETTY format.
Modes:
- auto - Summary in dev mode, off in prod (default)
- full - Show all segment events with timing
- summary - Show counts by name + top 5 slowest segments
- off - Don’t show segment events (still shows warnings and other events)
Priority: SCRIBE_LOG_CHILDS_DISPLAY > config
| Property | Value |
|---|---|
| Default | "auto" |
| Override | SCRIBE_LOG_CHILDS_DISPLAY (optional) |

monitoring.log.childs.display = "auto"
monitoring.log.childs.display = ${?SCRIBE_LOG_CHILDS_DISPLAY}

monitoring.log.childs.mode
Mode: off | minimal | full | auto
- auto - Full in dev mode, minimal in prod (default)
- off - Don’t track segments
- minimal - Track name, offset, duration only
- full - Track name, offset, duration, and segment attributes
AUTO behavior uses app.mode to determine environment:
- Dev mode (app.mode=dev/development/local/test) → full
- Production mode → minimal
Priority: SCRIBE_LOG_CHILDS_MODE > config
| Property | Value |
|---|---|
| Default | "auto" |
| Override | SCRIBE_LOG_CHILDS_MODE (optional) |
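
A short sketch of pinning both child-segment settings explicitly instead of relying on auto, for example while investigating a slow operation in a production-mode deployment. The chosen values are illustrative.

```hocon
monitoring.log.childs {
  # Track name, offset, duration, and segment attributes regardless of app.mode
  mode = "full"

  # In PRETTY output, show counts by name plus the slowest segments
  display = "summary"
}
```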

monitoring.log.childs.mode = "auto"
monitoring.log.childs.mode = ${?SCRIBE_LOG_CHILDS_MODE}

monitoring.log.childs.rules
Rules for which segments to track. First matching rule wins. Empty list = track all segments.
Rule syntax (same as log.rules): { action = include|exclude, name = "glob", where = "filter" }
Example: Filter sub-millisecond segments:
rules = [ { action = exclude, where = "duration.seconds<1ms" } ]

| Property | Value |
|---|---|
| Default | [ { action = exclude, where = "duration.seconds<1ms" } ] |

monitoring.log.childs.rules = [
  ## Default: filter sub-millisecond noise in production
  ## Exclude segments < 1ms
  { action = exclude, where = "duration.seconds<1ms" }
]

monitoring.log.enabled
Enable wide-event logging.
When disabled, no wide logs are emitted.
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_LOG_ENABLED (optional) |

monitoring.log.enabled = true
monitoring.log.enabled = ${?SCRIBE_LOG_ENABLED}

monitoring.log.format
Log format for wide event emission.
Supported formats:
- pretty - Human-friendly multi-line format with auto-grouped attributes, segment timeline, and color support (TTY only) (default)
- json - Single-line JSON payload (machine-parseable)
- auto - Pretty in dev mode + TTY, JSON otherwise
AUTO behavior uses app.mode to determine environment:
- Dev mode (app.mode=dev/development/local/test) + TTY → pretty with colors
- Everything else → json
| Property | Value |
|---|---|
| Default | "pretty" |
| Override | SCRIBE_LOG_FORMAT (optional) |

monitoring.log.format = "pretty"
monitoring.log.format = ${?SCRIBE_LOG_FORMAT}

monitoring.log.leak-detection
Leak Detection
Detects segments that remain open longer than expected, indicating potential resource leaks or forgotten end() calls. When triggered, logs a warning with the creation stack trace for debugging.
monitoring.log.leak-detection.mode
Mode: off | on | auto
- off - Never run leak detection
- on - Always run leak detection
- auto - Enabled when app.mode is dev/development/local/test; disabled otherwise
Priority: SCRIBE_LOG_LEAK_DETECTION > config
| Property | Value |
|---|---|
| Default | auto |
| Override | SCRIBE_LOG_LEAK_DETECTION (optional) |

monitoring.log.leak-detection.mode = auto
monitoring.log.leak-detection.mode = ${?SCRIBE_LOG_LEAK_DETECTION}

monitoring.log.leak-detection.rules
Rules for leak detection; the first match wins. Use name patterns to exclude known long-running operations. No matching rule = include (emit leak warning).
Examples:
- Exclude long-running workers: { action = exclude, name = "Transcription.*" }
- Exclude background jobs: { action = exclude, name = "Background.*" }
| Property | Value |
|---|---|
| Default | 5 exclude rules (see the listing below) |

monitoring.log.leak-detection.rules = [
  ## MCP uses SSE streaming - connections legitimately stay open indefinitely
  { action = exclude, name = "HTTP * /mcp" }
  { action = exclude, name = "HTTP * /observe/mcp" }

  ## Async LDAP searches during sync can run indefinitely - not leaks
  { action = exclude, name = "Ingest.AsyncSearch.*" }

  ## Reconciliation can take minutes for large directories - not a leak
  { action = exclude, name = "Ingest.Reconciliation" }

  ## Don't warn for segments that haven't exceeded the global threshold
  { action = exclude, where = "leak.duration.seconds<=90s" }
]

monitoring.log.leak-detection.threshold
Global minimum threshold for leak detection. Segments must be open at least this long to be considered potential leaks. Rules in monitoring.log.leak-detection.rules can add stricter thresholds for specific segment types.
Priority: SCRIBE_LOG_LEAK_DETECTION_THRESHOLD > config
| Property | Value |
|---|---|
| Default | 90s |
| Override | SCRIBE_LOG_LEAK_DETECTION_THRESHOLD (optional) |
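
The following sketch shows one way the mode, threshold, and rules might be set together. The segment name "Reports.NightlyExport" is hypothetical, and the last rule simply mirrors how the default rule list pairs with the default 90s threshold.

```hocon
monitoring.log.leak-detection {
  # Run leak detection regardless of app.mode
  mode = on

  # Segments open for less than 5 minutes are not considered leaks
  threshold = 5m

  rules = [
    # Hypothetical long-running job; replace with one of your own segment names
    { action = exclude, name = "Reports.NightlyExport" }

    # Mirror the global threshold, as the default list does for 90s
    { action = exclude, where = "leak.duration.seconds<=5m" }
  ]
}
```

In HOCON, assigning a list replaces the previous value rather than merging with it, so a custom rules list like this one drops the default exclusions (MCP, Ingest.*) unless they are repeated.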

monitoring.log.leak-detection.threshold = 90s
monitoring.log.leak-detection.threshold = ${?SCRIBE_LOG_LEAK_DETECTION_THRESHOLD}

monitoring.log.redaction
Redaction
monitoring.log.rules
Emission Rules
Unified rules for controlling which operations are logged. Rules are evaluated in order - first matching rule wins.
Decision Flow:
flowchart TD
    A[Operation Ends] --> B{Failure or Warning?}
    B -->|Yes| LOG[Log]
    B -->|No| C{markInteresting?}
    C -->|Yes| LOG
    C -->|No| D{First matching rule?}
    D -->|include| LOG
    D -->|exclude| SUPPRESS[Suppress]
    D -->|none| G{Sample rate?}
    G -->|Pass| LOG
    G -->|Fail| SUPPRESS
Rule syntax:
{ action = include|exclude, name = "glob", where = "filter" }
Fields:
- action (required): include (log the operation) or exclude (suppress it)
- name (optional): glob pattern for operation name (e.g., "LDAP.*", "*.ShutDown")
- where (optional): filter on attributes. Supports FleX, LDAP, SCIM, and JSON:
  - FleX: "scribe.result=ok" (preferred - cleaner syntax)
  - LDAP: "(scribe.result=ok)"
Operation names follow the format {Channel}.{Operation}:
| Channel | Operations |
|---|---|
| LDAP | Search, Bind, Compare, Modify, Add, Delete |
| REST | Search, Modify, Add, Delete |
| GraphQL | Search, Modify, Add, Delete |
| GRPC | Search, Modify, Add, Delete |
Duration attributes (ending in .seconds) support multiple formats:
- Plain seconds: 0.1, 90, 5.5
- HOCON style: 100ms, 5s, 1m, 2h
- ISO 8601: PT90S, PT1M30S, PT1H
Common examples (FleX is the runtime query language; also works in config):
- Log all LDAP operations: { action = include, name = "LDAP.*" }
- Suppress fast successful ops: { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
- Log slow operations: { action = include, where = "duration.seconds >= 5s" }
- Log errors or slow: { action = include, where = "scribe.result != ok or duration.seconds >= 5s" }
- Using natural language: { action = exclude, where = "scribe.result is ok and duration.seconds is at most 50ms" }
| Property | Value |
|---|---|
| Default | 11 exclude rules (see the listing below) |

monitoring.log.rules = [
  ## Suppress successful LDAP operations under 500ms (higher threshold for DB-backed searches)
  { action = exclude, name = "LDAP.*", where = "scribe.result=ok duration.seconds<=500ms" }

  ## Suppress fast successful operations (primary use case for noise reduction)
  { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }

  ## Suppress fast client errors (4xx) - expected from chaos/load testing
  ## These are client mistakes, not server issues; only log if slow (potential perf issue)
  { action = exclude, where = "failure.kind=INVALID_ARGUMENT duration.seconds<=100ms" }
  { action = exclude, where = "failure.kind=NOT_FOUND duration.seconds<=100ms" }

  ## Default noise gates for internal operations
  { action = exclude, name = "Transcription.WorkItem", where = "duration.seconds<=500ms" }
  { action = exclude, name = "Hints.ExplainSampling", where = "duration.seconds<=10s" }
  { action = exclude, name = "Hints.*", where = "duration.seconds<=100ms" }
  { action = exclude, name = "Metrics.*", where = "duration.seconds<=500ms" }
  { action = exclude, name = "*.ShutDown", where = "duration.seconds<=5s" }
  { action = exclude, name = "*.StartUp", where = "duration.seconds<=5s" }
  { action = exclude, name = "Ingest.AsyncSearch.*", where = "duration.seconds<=5s" }
]

monitoring.log.sample-rate
Random sampling for operations that don’t match any rule. 0 = never log, 100 = always log (default).
| Property | Value |
|---|---|
| Default | 100 |
| Override | SCRIBE_LOG_SAMPLE_RATE (optional) |
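
The interaction between rules and sampling can be sketched as follows. The fragment assumes sample-rate is a percentage, which the 0 to 100 range suggests but this page does not state outright; the rules reuse filters from the common examples under monitoring.log.rules.

```hocon
monitoring.log {
  # Single-line JSON, convenient for log shippers
  format = "json"

  rules = [
    # Always log slow operations
    { action = include, where = "duration.seconds >= 5s" }

    # Never log fast successful operations
    { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
  ]

  # Operations that match neither rule are sampled (assumed: about 10 in 100 are logged)
  sample-rate = 10
}
```

Per the decision flow, failures, warnings, and operations marked interesting are logged before rules or sampling are consulted. As with any HOCON list, assigning rules here replaces the default rule list.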

monitoring.log.sample-rate = 100
monitoring.log.sample-rate = ${?SCRIBE_LOG_SAMPLE_RATE}

monitoring.log.shutdown-noise
Shutdown Noise Suppression
Controls whether expected shutdown-time errors (connection refused, interrupts, cancellation, pool closed) are downgraded from ERROR to DEBUG.
Priority: SCRIBE_LOG_SHUTDOWN_NOISE > config
monitoring.log.shutdown-noise.mode
Mode: off | on | auto
- off - Never suppress; always log as ERROR (useful for debugging shutdown issues)
- on - Suppress known shutdown noise during shutdown phases
- auto - Suppress in test and production modes; show in development (default)
AUTO behavior uses app.mode to determine environment:
- Dev mode (app.mode=dev/development/local) → off (show errors for debugging)
- Test mode (app.mode=test) → on (quiet test output)
- Production mode → on (suppress expected teardown noise)
| Property | Value |
|---|---|
| Default | auto |
| Override | SCRIBE_LOG_SHUTDOWN_NOISE (optional) |

monitoring.log.shutdown-noise.mode = auto
monitoring.log.shutdown-noise.mode = ${?SCRIBE_LOG_SHUTDOWN_NOISE}

monitoring.mcp
MCP (Model Context Protocol) observe channel
Exposes operational insights to AI coding assistants via:
- Tools: status, health, doctor, signals, channels, stats
- Prompts: observe guide, troubleshooting-ops
- Resources: observe OpenAPI spec, resolved config
The endpoint is fixed at /observe/mcp.
monitoring.mcp.enabled
Enable the MCP observe channel.
Priority: SCRIBE_MONITORING_MCP_ENABLED > config
| Property | Value |
|---|---|
| Default | false |
| Override | SCRIBE_MONITORING_MCP_ENABLED (optional) |

monitoring.mcp.enabled = false
monitoring.mcp.enabled = ${?SCRIBE_MONITORING_MCP_ENABLED}

monitoring.mcp.socket
Socket reference(s) for the /observe/mcp endpoint. Can be a single name, comma-separated list, or HOCON list. Falls back to monitoring.socket if unset.
Priority: SCRIBE_MONITORING_MCP_SOCKET > config
Examples:
socket = "admin"              # Single socket
socket = "admin, public"      # Comma-separated
socket = ["admin", "public"]  # HOCON list

| Property | Value |
|---|---|
| Override | SCRIBE_MONITORING_MCP_SOCKET (optional) |
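
Because the channel is disabled by default, a complete fragment needs both settings. This sketch reuses the "admin" socket name from the examples above, which is assumed to be defined elsewhere in the configuration.

```hocon
monitoring.mcp {
  # Opt in to the MCP observe channel; the endpoint path is fixed at /observe/mcp
  enabled = true

  # Serve it only on the "admin" socket (name taken from the examples above)
  socket = "admin"
}
```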

monitoring.mcp.socket = ${?SCRIBE_MONITORING_MCP_SOCKET}

monitoring.metrics
Metrics configuration (OTLP push, independent of Prometheus scrape)
monitoring.metrics.enabled
Enable OTLP metrics push export.
Default: false (Prometheus at /metrics is the primary export)
Priority: SCRIBE_METRICS_ENABLED > config
| Property | Value |
|---|---|
| Default | false |
| Override | SCRIBE_METRICS_ENABLED (optional) |

monitoring.metrics.enabled = false
monitoring.metrics.enabled = ${?SCRIBE_METRICS_ENABLED}

monitoring.metrics.endpoint
OTLP endpoint for metrics
Priority: SCRIBE_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > config
| Property | Value |
|---|---|
| Default | "http://localhost:4317" |
| Override | SCRIBE_METRICS_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard) |

monitoring.metrics.endpoint = "http://localhost:4317"
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_METRICS_ENDPOINT}
monitoring.metrics.endpoint = ${?SCRIBE_METRICS_ENDPOINT}

monitoring.metrics.interval
Export interval for OTLP push
Priority: SCRIBE_METRICS_INTERVAL > config
| Property | Value |
|---|---|
| Default | 60 seconds |
| Override | SCRIBE_METRICS_INTERVAL (optional) |

monitoring.metrics.interval = 60 seconds
monitoring.metrics.interval = ${?SCRIBE_METRICS_INTERVAL}

monitoring.metrics.protocol
Protocol: “grpc” (port 4317) or “http/protobuf” (port 4318)
- When the protocol is “http/protobuf” and the endpoint still uses the default gRPC port (4317), Identity Scribe will automatically rewrite it to port 4318.
- For “http/protobuf”, if the endpoint does not already include “/v1/metrics”, it will be appended.
Priority: SCRIBE_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config
| Property | Value |
|---|---|
| Default | "grpc" |
| Override | SCRIBE_METRICS_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard) |
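
Putting the OTLP metrics settings together, a push configuration over HTTP might look like the sketch below. The collector hostname is hypothetical; the port follows the protocol description above.

```hocon
monitoring.metrics {
  # Opt in to OTLP push export (Prometheus scraping at /metrics is unaffected)
  enabled = true

  protocol = "http/protobuf"

  # Hypothetical collector host; 4318 is the http/protobuf port named above.
  # Per the notes above, "/v1/metrics" is appended when the endpoint lacks it.
  endpoint = "http://otel-collector.example.internal:4318"

  # Push every 30 seconds instead of the default 60
  interval = 30 seconds
}
```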

monitoring.metrics.protocol = "grpc"
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_METRICS_PROTOCOL}
monitoring.metrics.protocol = ${?SCRIBE_METRICS_PROTOCOL}

monitoring.observe
Observe endpoints (/observe, /observe/*). Provides status, doctor, pressure, indexes, hints, signatures, stats, etc.
monitoring.observe.enabled
Enable /observe/* endpoints.
Priority: SCRIBE_OBSERVE_ENABLED > config
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_OBSERVE_ENABLED (optional) |

monitoring.observe.enabled = true
monitoring.observe.enabled = ${?SCRIBE_OBSERVE_ENABLED}

monitoring.observe.socket
Socket for /observe/* endpoints. Falls back to monitoring.socket if unset.
Priority: SCRIBE_OBSERVE_SOCKET > config
| Property | Value |
|---|---|
| Override | SCRIBE_OBSERVE_SOCKET (optional) |

monitoring.observe.socket = ${?SCRIBE_OBSERVE_SOCKET}

monitoring.prometheus
Prometheus scrape endpoint (/metrics)
monitoring.prometheus.enabled
Enable the Prometheus scrape endpoint at /metrics.
Priority: SCRIBE_PROMETHEUS_ENABLED > config
| Property | Value |
|---|---|
| Default | true |
| Override | SCRIBE_PROMETHEUS_ENABLED (optional) |

monitoring.prometheus.enabled = true
monitoring.prometheus.enabled = ${?SCRIBE_PROMETHEUS_ENABLED}

monitoring.prometheus.scrape-interval
The interval at which metrics are collected internally. Background collection decouples collection latency from HTTP response time. A common Prometheus scrape interval is 60 seconds (the Prometheus default); the default here is 30 seconds, half of that.
Priority: SCRIBE_PROMETHEUS_SCRAPE_INTERVAL > config
| Property | Value |
|---|---|
| Default | 30 seconds |
| Override | SCRIBE_PROMETHEUS_SCRAPE_INTERVAL (optional) |

monitoring.prometheus.scrape-interval = 30 seconds
monitoring.prometheus.scrape-interval = ${?SCRIBE_PROMETHEUS_SCRAPE_INTERVAL}

monitoring.prometheus.socket
Socket for the /metrics endpoint. Falls back to monitoring.socket if unset.
Priority: SCRIBE_PROMETHEUS_SOCKET > config
| Property | Value |
|---|---|
| Override | SCRIBE_PROMETHEUS_SOCKET (optional) |

monitoring.prometheus.socket = ${?SCRIBE_PROMETHEUS_SOCKET}

monitoring.resource
OpenTelemetry (OTel) configuration
Environment variable overrides (highest priority wins):
- Priority: SCRIBE_* > OTEL_* > config file > default
- HOCON substitution: later values override earlier ones
Enablement semantics:
- OTel SDK is automatically enabled when any metric export is active:
  monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true
- OTEL_SDK_DISABLED=true disables OTLP exporters but still allows Prometheus pull
- To disable tracing: set monitoring.traces.enabled=false
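To make the enablement rules concrete, here is a sketch of a metrics-only setup. Prometheus pull stays on, which by the rules above keeps the OTel SDK enabled, and tracing is switched off explicitly. It assumes these keys are set in your own configuration file.

```hocon
# Metrics-only sketch: keep the Prometheus pull endpoint, no trace export.
monitoring {
  prometheus.enabled = true   # any active metric export enables the OTel SDK
  traces.enabled = false      # the documented way to disable tracing
}
```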
Resource attributes for all telemetry
monitoring.resource.environment
Deployment environment (e.g., “production”, “staging”, “dev”)
Priority: SCRIBE_RESOURCE_ENVIRONMENT > OTEL_DEPLOYMENT_ENVIRONMENT > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_RESOURCE_ENVIRONMENT (optional) > OTEL_DEPLOYMENT_ENVIRONMENT (standard) |
monitoring.resource.environment = ${?OTEL_DEPLOYMENT_ENVIRONMENT}
monitoring.resource.environment = ${?SCRIBE_RESOURCE_ENVIRONMENT}
monitoring.resource.service-name
Service name
Priority: SCRIBE_RESOURCE_SERVICE_NAME > OTEL_SERVICE_NAME > config
| Property | Value |
|---|---|
| Default | "identity-scribe" |
| Override | SCRIBE_RESOURCE_SERVICE_NAME (optional) > OTEL_SERVICE_NAME (standard) |
monitoring.resource.service-name = "identity-scribe"
monitoring.resource.service-name = ${?OTEL_SERVICE_NAME}
monitoring.resource.service-name = ${?SCRIBE_RESOURCE_SERVICE_NAME}
monitoring.resource.service-version
Service version (auto-detected from Implementation-Version if unset)
Priority: SCRIBE_RESOURCE_SERVICE_VERSION > OTEL_SERVICE_VERSION > config
| Property | Value |
|---|---|
| Default | null |
| Override | SCRIBE_RESOURCE_SERVICE_VERSION (optional) > OTEL_SERVICE_VERSION (standard) |
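The three resource attributes can be set together in one block. The fragment below is only a sketch: the environment and version values are placeholders chosen for illustration.

```hocon
# Sketch: placeholder values, not defaults (except service-name).
monitoring.resource {
  environment = "staging"
  service-name = "identity-scribe"   # the documented default, repeated for clarity
  service-version = "1.2.3"          # hypothetical; leave unset to auto-detect from Implementation-Version
}
```

Any of the SCRIBE_RESOURCE_* or OTEL_* variables listed for these keys would override the corresponding value, following the stated priority.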
monitoring.resource.service-version = ${?OTEL_SERVICE_VERSION}
monitoring.resource.service-version = ${?SCRIBE_RESOURCE_SERVICE_VERSION}
monitoring.socket
Socket reference for monitoring endpoints. Use a named socket from http.sockets.*. If unset (no env var, key omitted, or value is null), defaults to @default.
Priority: SCRIBE_MONITORING_SOCKET > config
| Property | Value |
|---|---|
| Override | SCRIBE_MONITORING_SOCKET (optional) |
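The fallback behaviour can be sketched as follows. Both the socket name "internal" and the way it is referenced (a plain string) are assumptions made for illustration; check the http.sockets documentation for the actual syntax.

```hocon
# Sketch: "internal" is a hypothetical socket assumed to be defined under http.sockets.internal.
# Referencing it by a plain string is also an assumption.
monitoring.socket = "internal"

# Per-endpoint keys that are left unset fall back to monitoring.socket:
#   monitoring.health.socket
#   monitoring.observe.socket
#   monitoring.prometheus.socket
```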
monitoring.socket = ${?SCRIBE_MONITORING_SOCKET}
monitoring.traces
Trace exporter configuration
monitoring.traces.enabled
Enable trace export. Tracing is opt-in, so the default is false.
Priority: SCRIBE_TRACES_ENABLED > config
| Property | Value |
|---|---|
| Default | false |
| Override | SCRIBE_TRACES_ENABLED (optional) |
monitoring.traces.enabled = false
monitoring.traces.enabled = ${?SCRIBE_TRACES_ENABLED}
monitoring.traces.endpoint
OTLP endpoint for traces
Priority: SCRIBE_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > config
| Property | Value |
|---|---|
| Default | "http://localhost:4317" |
| Override | SCRIBE_TRACES_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard) |
monitoring.traces.endpoint = "http://localhost:4317"
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_TRACES_ENDPOINT}
monitoring.traces.endpoint = ${?SCRIBE_TRACES_ENDPOINT}
monitoring.traces.protocol
Protocol: “grpc” (port 4317) or “http/protobuf” (port 4318)
- When the protocol is “http/protobuf” and the endpoint still uses the default gRPC port (4317), Identity Scribe automatically rewrites it to port 4318.
- For “http/protobuf”, if the endpoint does not already include “/v1/traces”, it is appended.
Priority: SCRIBE_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config
| Property | Value |
|---|---|
| Default | "grpc" |
| Override | SCRIBE_TRACES_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard) |
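A sketch of switching trace export to OTLP over HTTP might look like the following. The hostname is a placeholder, and the fragment assumes it lives in your own configuration file.

```hocon
# Sketch: "collector.example.internal" is a placeholder hostname.
monitoring.traces {
  enabled = true
  protocol = "http/protobuf"
  # Port 4318 and the /v1/traces path are spelled out here. Per the notes above,
  # /v1/traces would be appended automatically if it were missing.
  endpoint = "http://collector.example.internal:4318/v1/traces"
}
```

Writing the port and path explicitly avoids relying on the automatic rewrite, which makes the effective endpoint easier to read from the config alone.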
monitoring.traces.protocol = "grpc"
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_TRACES_PROTOCOL}
monitoring.traces.protocol = ${?SCRIBE_TRACES_PROTOCOL}
monitoring.traces.timeout
Export timeout
Priority: SCRIBE_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TIMEOUT > config
| Property | Value |
|---|---|
| Default | 10 seconds |
| Override | SCRIBE_TRACES_TIMEOUT (optional) > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT (standard) > OTEL_EXPORTER_OTLP_TIMEOUT (standard) |
monitoring.traces.timeout = 10 seconds
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TIMEOUT}
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TRACES_TIMEOUT}
monitoring.traces.timeout = ${?SCRIBE_TRACES_TIMEOUT}