Monitoring

Monitoring endpoints and telemetry configuration. Includes Prometheus metrics, health probes, and OpenTelemetry tracing.

Enable the monitoring server.

| Property | Value |
| --- | --- |
| Default | true |
monitoring.enabled = true

Exemplars (metrics → traces) for histogram metrics. Exemplars attach trace context (traceId, spanId) to histogram buckets, enabling “click to trace” in Grafana when using Prometheus + Tempo.

Values:

  • off - No exemplars recorded (default, minimal overhead)
  • traceBased - Record exemplars only when an active sampled span exists
  • alwaysOn - Record all measurements as exemplar candidates

IMPORTANT:

  • traceBased requires tracing to be enabled (monitoring.traces.enabled=true)
  • Exemplars require a metrics exporter to be enabled: monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true

External requirements (operator responsibility):

  • Prometheus: --enable-feature=exemplar-storage
  • Grafana: Configure Tempo datasource with exemplar linking

Priority: SCRIBE_METRICS_EXEMPLARS > OTEL_METRICS_EXEMPLAR_FILTER > config

| Property | Value |
| --- | --- |
| Default | "off" |
| Override | SCRIBE_METRICS_EXEMPLARS (optional) > OTEL_METRICS_EXEMPLAR_FILTER (standard) |
monitoring.exemplars = "off"
monitoring.exemplars = ${?OTEL_METRICS_EXEMPLAR_FILTER}
monitoring.exemplars = ${?SCRIBE_METRICS_EXEMPLARS}
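
As an illustration of the requirements listed above, a minimal sketch of a configuration for trace-based exemplars might look like the following. It uses only keys documented on this page; that the fragment is layered over the defaults from your own config file is an assumption.

```hocon
# Sketch: trace-based exemplars exposed through the Prometheus scrape endpoint.
monitoring {
  # Record exemplars only when an active sampled span exists.
  exemplars = "traceBased"

  # traceBased requires tracing to be enabled.
  traces.enabled = true

  # At least one metrics exporter must be enabled (true is already the default).
  prometheus.enabled = true
}
```

If SCRIBE_METRICS_EXEMPLARS or OTEL_METRICS_EXEMPLAR_FILTER is set in the environment, it takes priority over the value written here.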

Health probe endpoints (/livez, /readyz, /startedz, /healthz)

Kubernetes-style health probes for liveness, readiness, and startup checks.

Enable health probe endpoints.

Priority: SCRIBE_HEALTH_ENABLED > config

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_HEALTH_ENABLED (optional) |
monitoring.health.enabled = true
monitoring.health.enabled = ${?SCRIBE_HEALTH_ENABLED}

Socket for health probe endpoints. Falls back to monitoring.socket if unset.

Priority: SCRIBE_HEALTH_SOCKET > config

| Property | Value |
| --- | --- |
| Override | SCRIBE_HEALTH_SOCKET (optional) |
monitoring.health.socket = ${?SCRIBE_HEALTH_SOCKET}

Configuration for the operational hint engine. The hint engine detects performance issues and optimization opportunities during LDAP search execution, surfacing actionable recommendations via metrics and structured logs.

Detector modules:

  • Equality/Range Coverage: Detects missing valueMatch indexes
  • Sort/VLV Readiness: Detects missing sortable indexes and non-deterministic ordering
  • Control Fallback: Detects when LDAP controls force delegation or rewrites
  • Type Scope Guidance: Recommends explicit entryType constraints for better performance
  • Startup Validation: Validates indices configuration at application startup

Query signature cache configuration. Controls TTL and size for the query signature cache used by EXPLAIN sampling. This cache stores execution counts, last EXPLAIN timestamps, sequential-scan flags, and SQL text needed for future analysis.

Maximum number of cached query signatures. When cache is full, least-recently-used entries are evicted.

Priority: SCRIBE_HINTS_CACHE_SIZE > config

Default: 500

| Property | Value |
| --- | --- |
| Default | 500 |
| Override | SCRIBE_HINTS_CACHE_SIZE (optional) |
monitoring.hints.cache.size = 500
monitoring.hints.cache.size = ${?SCRIBE_HINTS_CACHE_SIZE}

Time-to-live for signature cache entries. Signatures older than this TTL will be re-EXPLAINed on next execution. Cache uses expire-after-access, so hot signatures stay longer.

Priority: SCRIBE_HINTS_CACHE_TTL > config

Default: 1 hour

| Property | Value |
| --- | --- |
| Default | 1 hour |
| Override | SCRIBE_HINTS_CACHE_TTL (optional) |
monitoring.hints.cache.ttl = 1 hour
monitoring.hints.cache.ttl = ${?SCRIBE_HINTS_CACHE_TTL}

Per-detector module toggles.

Each detector can be individually enabled or disabled. Defaults to enabled when monitoring.hints.enabled is true.

monitoring.hints.detectors.config-validation

Enable/disable startup configuration validation. Validates that all indexed attributes exist in observed attributes and checks virtual attributes for entryType in filters. Emits CONFIG hints for misconfigurations at startup.

| Property | Value |
| --- | --- |
| Default | true |
monitoring.hints.detectors.config-validation = true

monitoring.hints.detectors.control-fallback

Enable/disable control fallback detection. Emits hints when LDAP controls (e.g., VLV assertion values, unsupported sort attributes) force delegation or in-memory rewrites.

| Property | Value |
| --- | --- |
| Default | true |
monitoring.hints.detectors.control-fallback = true

Enable/disable equality and range index coverage detection. Emits hints when equality (=), greaterOrEqual (>=), or lessOrEqual (<=) filters are used on attributes missing value-match (truncation) indexes.

| Property | Value |
| --- | --- |
| Default | true |
monitoring.hints.detectors.equality-index = true

Enable/disable sort and VLV readiness detection. Emits hints when sortable indexes are missing or sort order is non-deterministic (multi-valued attributes without tie-breaker).

| Property | Value |
| --- | --- |
| Default | true |
monitoring.hints.detectors.sort-plan = true

monitoring.hints.detectors.type-scope-guidance

Enable/disable type scope guidance. Emits informational hints recommending explicit (entryType=X) constraints when types are known from DN context but filters lack entryType.

| Property | Value |
| --- | --- |
| Default | true |
monitoring.hints.detectors.type-scope-guidance = true

Enable or disable hint collection and emission. When disabled, no hints are collected and no metrics are incremented.

Priority: SCRIBE_HINTS_ENABLED > config

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_HINTS_ENABLED (optional) |
monitoring.hints.enabled = true
monitoring.hints.enabled = ${?SCRIBE_HINTS_ENABLED}
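
To make the relationship between the global switch and the per-detector toggles concrete, here is a small sketch that keeps the hint engine on but silences one detector. It relies only on the keys documented above; which detector to disable is an arbitrary choice for illustration.

```hocon
monitoring.hints {
  # Global switch: when false, no hints are collected at all.
  enabled = true

  detectors {
    # Turn off only the informational entryType guidance.
    type-scope-guidance = false
    # Detectors not listed here keep their default (true).
  }
}
```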

EXPLAIN-based runtime analysis and query signature tracking. Enables signature-based EXPLAIN sampling to detect sequential scans and correlate query patterns with performance issues.

Enable or disable EXPLAIN sampling and query signature tracking. When disabled, no signatures are computed, no EXPLAIN plans are analyzed, and the /observe/signatures endpoint is unavailable. Defaults to enabled when monitoring.hints.enabled is true.

Priority: SCRIBE_HINTS_EXPLAIN_ENABLED > config

| Property | Value |
| --- | --- |
| Override | SCRIBE_HINTS_EXPLAIN_ENABLED (optional) |
monitoring.hints.explain.enabled = ${?SCRIBE_HINTS_EXPLAIN_ENABLED}

EXPLAIN command options for enhanced plan analysis. Options control what information is included in EXPLAIN output:

  • ANALYZE: Execute query and show actual runtime statistics (actual rows, actual time)
  • COSTS: Include estimated costs (enabled by default in PostgreSQL)
  • VERBOSE: Include additional details (output columns, schema-qualified names)
  • BUFFERS: Include buffer usage statistics (requires ANALYZE)

Default: [] (empty, uses default PostgreSQL EXPLAIN behavior)

Example: ["ANALYZE", "BUFFERS"] for runtime stats and buffer usage

| Property | Value |
| --- | --- |
| Default | [] |
monitoring.hints.explain.options = []
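
The option list described above could be written as the following sketch. Note that in PostgreSQL, EXPLAIN with ANALYZE actually executes the statement, so sampled queries run one extra time; whether that cost is acceptable for your workload is something to judge per deployment.

```hocon
monitoring.hints.explain {
  enabled = true

  # ANALYZE executes the query to collect actual row counts and timings;
  # BUFFERS adds buffer usage statistics and requires ANALYZE.
  options = ["ANALYZE", "BUFFERS"]
}
```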

Re-EXPLAIN every N query executions for the same signature. Used to detect plan drift and refresh analysis for frequently-executed queries.

Priority: SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY > config

Default: 1000

| Property | Value |
| --- | --- |
| Default | 1000 |
| Override | SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY (optional) |
monitoring.hints.explain.sample-every = 1000
monitoring.hints.explain.sample-every = ${?SCRIBE_HINTS_EXPLAIN_SAMPLE_EVERY}

monitoring.hints.explain.slow-query-duration

Slow query threshold duration. Queries exceeding this duration trigger high-priority EXPLAIN sampling (non-blocking, background execution). This value also controls fast-query hint filtering: sequential scan hints without actionable index recommendations are suppressed when execution time is below sqrt(slow-query-duration) (~31ms at default 1s). This reduces noise from fast queries on small tables that don’t benefit from indexing.

| slow-query-duration | Fast hint threshold |
| --- | --- |
| 100ms | ~10ms |
| 500ms | ~22ms |
| 1000ms (default) | ~31ms |
| 5000ms | ~70ms |
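
The thresholds in the table follow from the square-root relation described above when the duration is expressed in milliseconds. The unit is inferred from the table values rather than stated explicitly, so treat it as an assumption:

```latex
t_{\text{fast}} = \sqrt{d_{\text{slow}}\,[\text{ms}]}\ \text{ms},
\qquad
\sqrt{100} = 10,\quad
\sqrt{500} \approx 22.4,\quad
\sqrt{1000} \approx 31.6,\quad
\sqrt{5000} \approx 70.7
```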

Priority: SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION > config

Default: 1000ms

| Property | Value |
| --- | --- |
| Default | 1000ms |
| Override | SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION (optional) |
monitoring.hints.explain.slow-query-duration = 1000ms
monitoring.hints.explain.slow-query-duration = ${?SCRIBE_HINTS_EXPLAIN_SLOW_QUERY_DURATION}

Hint persistence configuration.

Persists hints to PostgreSQL for audit and Insights visualization. When enabled, both hint signals and query signatures are persisted. Query signatures store the full normalized query structure (filter, sort, scope, types, controls) for debugging and correlation with hints.

monitoring.hints.persistence.batch-interval

Maximum wait time between batch flushes. Batches are flushed when full or when this interval elapses.

Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL > config

Default: 5 seconds

| Property | Value |
| --- | --- |
| Default | 5 seconds |
| Override | SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL (optional) |
monitoring.hints.persistence.batch-interval = 5 seconds
monitoring.hints.persistence.batch-interval = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_INTERVAL}

Batch size for database inserts. Hints and signatures are batched before writing to reduce database load.

Priority: SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE > config

Default: 100

| Property | Value |
| --- | --- |
| Default | 100 |
| Override | SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE (optional) |
monitoring.hints.persistence.batch-size = 100
monitoring.hints.persistence.batch-size = ${?SCRIBE_HINTS_PERSISTENCE_BATCH_SIZE}

monitoring.hints.persistence.clear-on-startup

Clear persisted hints on startup. When true, truncates hint_signals and hint_signatures tables at startup. Ensures hints reflect current configuration rather than stale state from previous runs. Useful when configuration changes (e.g., new indexes) make old hints obsolete.

Priority: SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP > config

Default: true

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP (optional) |
monitoring.hints.persistence.clear-on-startup = true
monitoring.hints.persistence.clear-on-startup = ${?SCRIBE_HINTS_PERSISTENCE_CLEAR_ON_STARTUP}

Enable or disable hint persistence. When disabled, hints are only emitted via metrics and logs.

Priority: SCRIBE_HINTS_PERSISTENCE_ENABLED > config

Default: false (opt-in)

| Property | Value |
| --- | --- |
| Default | false |
| Override | SCRIBE_HINTS_PERSISTENCE_ENABLED (optional) |
monitoring.hints.persistence.enabled = false
monitoring.hints.persistence.enabled = ${?SCRIBE_HINTS_PERSISTENCE_ENABLED}

Maximum size of the in-memory persistence queue. When queue is full, newest hints are dropped (FIFO eviction).

Priority: SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE > config

Default: 5000

| Property | Value |
| --- | --- |
| Default | 5000 |
| Override | SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE (optional) |
monitoring.hints.persistence.max-queue = 5000
monitoring.hints.persistence.max-queue = ${?SCRIBE_HINTS_PERSISTENCE_MAX_QUEUE}

Maximum number of hint rows to retain in the database (ring-buffer). When this limit is reached, oldest hints are evicted to make room for new ones.

Priority: SCRIBE_HINTS_PERSISTENCE_MAX_ROWS > config

Default: 50000

| Property | Value |
| --- | --- |
| Default | 50000 |
| Override | SCRIBE_HINTS_PERSISTENCE_MAX_ROWS (optional) |
monitoring.hints.persistence.max-rows = 50000
monitoring.hints.persistence.max-rows = ${?SCRIBE_HINTS_PERSISTENCE_MAX_ROWS}

Rules for filtering which hints are persisted. Rules are evaluated in order - first matching rule wins. Hints that match an "exclude" rule are still emitted via metrics and logs, but not stored in the database.

Rule syntax (same as log.rules): { action = include|exclude, where = "filter" }

Available attributes for filtering:

  • hint-type: PARTIAL_MATCH_INDEX, EQUALITY_INDEX, SORT_PLAN, CONTROL_FALLBACK, EXPLAIN_SEQ_SCAN, TYPE_SCOPE_GUIDANCE, CONFIG

  • severity: INFO, WARNING, ERROR
  • attribute: the attribute name (e.g., “cn”, “mail”), null if not applicable
  • entry-type: first entry type if multiple (e.g., “inetOrgPerson”), null if none
  • base-dn: the base DN of the query (e.g., “ou=users,dc=example,dc=com”), null if none

Examples:

Default: [] (empty - persist all hints)

| Property | Value |
| --- | --- |
| Default | [] |
monitoring.hints.persistence.rules = []
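
A hedged sketch of how persistence rules might be combined, using the documented rule syntax and attribute names. The filter expressions assume the same attribute=value form shown for log.rules (for example "scribe.result=ok"); the exact filter strings are illustrative and not taken from the product documentation.

```hocon
monitoring.hints.persistence {
  enabled = true

  rules = [
    # First match wins: keep every EXPLAIN-detected sequential scan.
    { action = include, where = "hint-type=EXPLAIN_SEQ_SCAN" }
    # Keep remaining informational hints out of the database
    # (they are still emitted via metrics and logs).
    { action = exclude, where = "severity=INFO" }
    # Hints matching no rule are assumed to be persisted,
    # consistent with the empty default list persisting everything.
  ]
}
```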

Time-to-live for persisted hints and signatures. Hints and signatures older than this TTL are automatically purged.

Priority: SCRIBE_HINTS_PERSISTENCE_TTL > config

Default: 7 days

| Property | Value |
| --- | --- |
| Default | 7 days |
| Override | SCRIBE_HINTS_PERSISTENCE_TTL (optional) |
monitoring.hints.persistence.ttl = 7 days
monitoring.hints.persistence.ttl = ${?SCRIBE_HINTS_PERSISTENCE_TTL}

HTTP server settings for monitoring endpoints.

This section inherits all settings from http. Override individual settings as needed.


Wide Log Operations

Accumulates context throughout request/task execution and emits a single structured log line when the operation is “interesting” (errors, warnings, slow). Wide events are emitted at INFO/WARN/ERROR level with the format:

scribe.log {"trace_id":"…","span_id":"…","duration":"PT0.123S",…}

Fields always included: duration (ISO-8601), result (ok|<failure_kind>)

Optional fields: trace_id, span_id, parent_span_id (when tracing enabled), failure (kind, code, details), warnings, events

Child Segment Tracking

Records child segments as events in wide logs with offset/duration timing. Useful for understanding operation breakdown and identifying slow sub-operations.

Display mode for segment events in PRETTY format.

Modes:

  • auto - Summary in dev mode, off in prod (default)
  • full - Show all segment events with timing
  • summary - Show counts by name + top 5 slowest segments
  • off - Don’t show segment events (still shows warnings and other events)

Priority: SCRIBE_LOG_CHILDS_DISPLAY > config

| Property | Value |
| --- | --- |
| Default | "auto" |
| Override | SCRIBE_LOG_CHILDS_DISPLAY (optional) |
monitoring.log.childs.display = "auto"
monitoring.log.childs.display = ${?SCRIBE_LOG_CHILDS_DISPLAY}

Mode: off | minimal | full | auto

  • auto - Full in dev mode, minimal in prod (default)
  • off - Don’t track segments
  • minimal - Track name, offset, duration only
  • full - Track name, offset, duration, and segment attributes

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local/test) → full
  • Production mode → minimal

Priority: SCRIBE_LOG_CHILDS_MODE > config

| Property | Value |
| --- | --- |
| Default | "auto" |
| Override | SCRIBE_LOG_CHILDS_MODE (optional) |
monitoring.log.childs.mode = "auto"
monitoring.log.childs.mode = ${?SCRIBE_LOG_CHILDS_MODE}

Rules for which segments to track. First matching rule wins. Empty list = track all segments.

Rule syntax (same as log.rules): { action = include|exclude, name = "glob", where = "filter" }

Example: Filter sub-millisecond segments:

rules = [
{ action = exclude, where = "duration.seconds<1ms" }
]

| Property | Value |
| --- | --- |
| Default | [ { action = exclude, where = "duration.seconds<1ms" } ] |

monitoring.log.childs.rules = [
## Default: filter sub-millisecond noise in production
## Exclude segments < 1ms
{ action = exclude, where = "duration.seconds<1ms" }
]

Enable wide-event logging.

When disabled, no wide logs are emitted.

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_LOG_ENABLED (optional) |
monitoring.log.enabled = true
monitoring.log.enabled = ${?SCRIBE_LOG_ENABLED}

Log format for wide event emission.

Supported formats:

  • pretty - Human-friendly multi-line format with auto-grouped attributes, segment timeline, and color support (TTY only) (default)
  • json - Single-line JSON payload (machine-parseable)
  • auto - Pretty in dev mode + TTY, JSON otherwise

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local/test) + TTY → pretty with colors
  • Everything else → json

| Property | Value |
| --- | --- |
| Default | "pretty" |
| Override | SCRIBE_LOG_FORMAT (optional) |
monitoring.log.format = "pretty"
monitoring.log.format = ${?SCRIBE_LOG_FORMAT}

Leak Detection

Detects segments that remain open longer than expected, indicating potential resource leaks or forgotten end() calls. When triggered, logs a warning with the creation stack trace for debugging.

Mode: off | on | auto

  • off - Never run leak detection
  • on - Always run leak detection
  • auto - Enabled when app.mode is dev/development/local/test; disabled otherwise

Priority: SCRIBE_LOG_LEAK_DETECTION > config

| Property | Value |
| --- | --- |
| Default | auto |
| Override | SCRIBE_LOG_LEAK_DETECTION (optional) |
monitoring.log.leak-detection.mode = auto
monitoring.log.leak-detection.mode = ${?SCRIBE_LOG_LEAK_DETECTION}

Rules for leak detection - first match wins. Use name patterns to exclude known long-running operations. No matching rule = include (emit leak warning).

Examples:

  • Exclude long-running workers: { action = exclude, name = "Transcription.*" }
  • Exclude background jobs: { action = exclude, name = "Background.*" }

| Property | Value |
| --- | --- |
| Default | The rule list shown below |

monitoring.log.leak-detection.rules = [
## MCP uses SSE streaming - connections legitimately stay open indefinitely
{ action = exclude, name = "HTTP * /mcp" }
{ action = exclude, name = "HTTP * /observe/mcp" }
## Async LDAP searches during sync can run indefinitely - not leaks
{ action = exclude, name = "Ingest.AsyncSearch.*" }
## Reconciliation can take minutes for large directories - not a leak
{ action = exclude, name = "Ingest.Reconciliation" }
## Don't warn for segments that haven't exceeded the global threshold
{ action = exclude, where = "leak.duration.seconds<=90s" }
]

Global minimum threshold for leak detection. Segments must be open at least this long to be considered potential leaks. Rules in monitoring.log.leak-detection.rules can add stricter thresholds for specific segment types.

Priority: SCRIBE_LOG_LEAK_DETECTION_THRESHOLD > config

| Property | Value |
| --- | --- |
| Default | 90s |
| Override | SCRIBE_LOG_LEAK_DETECTION_THRESHOLD (optional) |
monitoring.log.leak-detection.threshold = 90s
monitoring.log.leak-detection.threshold = ${?SCRIBE_LOG_LEAK_DETECTION_THRESHOLD}

Redaction


Emission Rules

Unified rules for controlling which operations are logged. Rules are evaluated in order - first matching rule wins.

Decision Flow:

flowchart TD
A[Operation Ends] --> B{Failure or Warning?}
B -->|Yes| LOG[Log]
B -->|No| C{markInteresting?}
C -->|Yes| LOG
C -->|No| D{First matching rule?}
D -->|include| LOG
D -->|exclude| SUPPRESS[Suppress]
D -->|none| G{Sample rate?}
G -->|Pass| LOG
G -->|Fail| SUPPRESS

Rule syntax:

{ action = include|exclude, name = "glob", where = "filter" }

Fields:

  • action (required): include (log the operation) or exclude (suppress it)
  • name (optional): glob pattern for operation name (e.g., "LDAP.*", "*.ShutDown")
  • where (optional): filter on attributes. Supports FleX, LDAP, SCIM, and JSON:
      • FleX: "scribe.result=ok" (preferred - cleaner syntax)
      • LDAP: "(scribe.result=ok)"

Operation names follow the format {Channel}.{Operation}:

| Channel | Operations |
| --- | --- |
| LDAP | Search, Bind, Compare, Modify, Add, Delete |
| REST | Search, Modify, Add, Delete |
| GraphQL | Search, Modify, Add, Delete |
| GRPC | Search, Modify, Add, Delete |

Duration attributes (ending in .seconds) support multiple formats:

  • Plain seconds: 0.1, 90, 5.5
  • HOCON style: 100ms, 5s, 1m, 2h
  • ISO 8601: PT90S, PT1M30S, PT1H

Common examples (FleX is the runtime query language; also works in config):

  • Log all LDAP operations: { action = include, name = "LDAP.*" }
  • Suppress fast successful ops: { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
  • Log slow operations: { action = include, where = "duration.seconds >= 5s" }
  • Log errors or slow: { action = include, where = "scribe.result != ok or duration.seconds >= 5s" }
  • Using natural language: { action = exclude, where = "scribe.result is ok and duration.seconds is at most 50ms" }

| Property | Value |
| --- | --- |
| Default | The rule list shown below |

monitoring.log.rules = [
## Suppress successful LDAP operations under 500ms (higher threshold for DB-backed searches)
{ action = exclude, name = "LDAP.*", where = "scribe.result=ok duration.seconds<=500ms" }
## Suppress fast successful operations (primary use case for noise reduction)
{ action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }
## Suppress fast client errors (4xx) - expected from chaos/load testing
## These are client mistakes, not server issues; only log if slow (potential perf issue)
{ action = exclude, where = "failure.kind=INVALID_ARGUMENT duration.seconds<=100ms" }
{ action = exclude, where = "failure.kind=NOT_FOUND duration.seconds<=100ms" }
## Default noise gates for internal operations
{ action = exclude, name = "Transcription.WorkItem", where = "duration.seconds<=500ms" }
{ action = exclude, name = "Hints.ExplainSampling", where = "duration.seconds<=10s" }
{ action = exclude, name = "Hints.*", where = "duration.seconds<=100ms" }
{ action = exclude, name = "Metrics.*", where = "duration.seconds<=500ms" }
{ action = exclude, name = "*.ShutDown", where = "duration.seconds<=5s" }
{ action = exclude, name = "*.StartUp", where = "duration.seconds<=5s" }
{ action = exclude, name = "Ingest.AsyncSearch.*", where = "duration.seconds<=5s" }
]
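
To illustrate first-match-wins ordering, the sketch below always logs LDAP Bind operations and then suppresses other fast successes. The operation name follows the {Channel}.{Operation} format described above. In HOCON, assigning a list to a key replaces any earlier value for that key rather than merging with it, so a list like this would presumably stand in for the default noise gates rather than extend them.

```hocon
monitoring.log.rules = [
  # Evaluated first: always log LDAP Bind operations, even fast successful ones.
  { action = include, name = "LDAP.Bind" }

  # Evaluated next: suppress any other successful operation that finishes within 50ms.
  { action = exclude, where = "scribe.result=ok duration.seconds<=50ms" }

  # Operations matching neither rule fall through to monitoring.log.sample-rate.
]
```

Per the decision flow above, operations that end with a failure or warning are logged before any rule is consulted, so the exclude rule cannot hide them.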

Random sampling for operations that don’t match any rule: 0 = never log, 100 = always log (default).

| Property | Value |
| --- | --- |
| Default | 100 |
| Override | SCRIBE_LOG_SAMPLE_RATE (optional) |
monitoring.log.sample-rate = 100
monitoring.log.sample-rate = ${?SCRIBE_LOG_SAMPLE_RATE}

Shutdown Noise Suppression

Controls whether expected shutdown-time errors (connection refused, interrupts, cancellation, pool closed) are downgraded from ERROR to DEBUG.

Priority: SCRIBE_LOG_SHUTDOWN_NOISE > config

Mode: off | on | auto

  • off - Never suppress; always log as ERROR (useful for debugging shutdown issues)
  • on - Suppress known shutdown noise during shutdown phases
  • auto - Suppress in test and production modes; show in development (default)

AUTO behavior uses app.mode to determine environment:

  • Dev mode (app.mode=dev/development/local) → off (show errors for debugging)
  • Test mode (app.mode=test) → on (quiet test output)
  • Production mode → on (suppress expected teardown noise)

| Property | Value |
| --- | --- |
| Default | auto |
| Override | SCRIBE_LOG_SHUTDOWN_NOISE (optional) |
monitoring.log.shutdown-noise.mode = auto
monitoring.log.shutdown-noise.mode = ${?SCRIBE_LOG_SHUTDOWN_NOISE}

MCP (Model Context Protocol) observe channel

Exposes operational insights to AI coding assistants via:

  • Tools: status, health, doctor, signals, channels, stats
  • Prompts: observe guide, troubleshooting-ops
  • Resources: observe OpenAPI spec, resolved config

The endpoint is fixed at /observe/mcp.

Enable the MCP observe channel

Priority: SCRIBE_MONITORING_MCP_ENABLED > config

| Property | Value |
| --- | --- |
| Default | false |
| Override | SCRIBE_MONITORING_MCP_ENABLED (optional) |
monitoring.mcp.enabled = false
monitoring.mcp.enabled = ${?SCRIBE_MONITORING_MCP_ENABLED}

Socket reference(s) for /observe/mcp endpoint. Can be a single name, comma-separated list, or HOCON list. Falls back to monitoring.socket if unset.

Priority: SCRIBE_MONITORING_MCP_SOCKET > config

Examples:

socket = "admin" # Single socket
socket = "admin, public" # Comma-separated
socket = ["admin", "public"] # HOCON list

| Property | Value |
| --- | --- |
| Override | SCRIBE_MONITORING_MCP_SOCKET (optional) |
monitoring.mcp.socket = ${?SCRIBE_MONITORING_MCP_SOCKET}

Metrics configuration (OTLP push, independent of Prometheus scrape)

Enable OTLP metrics push export

Default: false (Prometheus at /metrics is the primary export)

Priority: SCRIBE_METRICS_ENABLED > config

| Property | Value |
| --- | --- |
| Default | false |
| Override | SCRIBE_METRICS_ENABLED (optional) |
monitoring.metrics.enabled = false
monitoring.metrics.enabled = ${?SCRIBE_METRICS_ENABLED}

OTLP endpoint for metrics

Priority: SCRIBE_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > config

| Property | Value |
| --- | --- |
| Default | "http://localhost:4317" |
| Override | SCRIBE_METRICS_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_METRICS_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard) |
monitoring.metrics.endpoint = "http://localhost:4317"
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.metrics.endpoint = ${?OTEL_EXPORTER_OTLP_METRICS_ENDPOINT}
monitoring.metrics.endpoint = ${?SCRIBE_METRICS_ENDPOINT}

Export interval for OTLP push

Priority: SCRIBE_METRICS_INTERVAL > config

| Property | Value |
| --- | --- |
| Default | 60 seconds |
| Override | SCRIBE_METRICS_INTERVAL (optional) |
monitoring.metrics.interval = 60 seconds
monitoring.metrics.interval = ${?SCRIBE_METRICS_INTERVAL}

  • When protocol is "http/protobuf" and the endpoint still uses the default gRPC port (4317), Identity Scribe will automatically rewrite it to port 4318.
  • For "http/protobuf", if the endpoint does not already include "/v1/metrics", it will be appended.

Protocol: "grpc" (port 4317) or "http/protobuf" (port 4318)

Priority: SCRIBE_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config

| Property | Value |
| --- | --- |
| Default | "grpc" |
| Override | SCRIBE_METRICS_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_METRICS_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard) |
monitoring.metrics.protocol = "grpc"
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.metrics.protocol = ${?OTEL_EXPORTER_OTLP_METRICS_PROTOCOL}
monitoring.metrics.protocol = ${?SCRIBE_METRICS_PROTOCOL}
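
Putting the metrics settings together, a sketch of OTLP push over HTTP might look like this. The collector hostname is hypothetical; the keys and the port come from the descriptions above.

```hocon
monitoring.metrics {
  # OTLP push is opt-in; Prometheus scraping at /metrics stays available independently.
  enabled = true

  protocol = "http/protobuf"

  # Hypothetical collector host. No path is given here because, as described above,
  # "/v1/metrics" is appended for http/protobuf when it is missing.
  endpoint = "http://otel-collector.example.internal:4318"

  # Push every 30 seconds instead of the default 60.
  interval = 30 seconds
}
```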

Observe endpoints (/observe, /observe/*)

Provides status, doctor, pressure, indexes, hints, signatures, stats, etc.

Enable /observe/* endpoints.

Priority: SCRIBE_OBSERVE_ENABLED > config

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_OBSERVE_ENABLED (optional) |
monitoring.observe.enabled = true
monitoring.observe.enabled = ${?SCRIBE_OBSERVE_ENABLED}

Socket for /observe/* endpoints. Falls back to monitoring.socket if unset.

Priority: SCRIBE_OBSERVE_SOCKET > config

| Property | Value |
| --- | --- |
| Override | SCRIBE_OBSERVE_SOCKET (optional) |
monitoring.observe.socket = ${?SCRIBE_OBSERVE_SOCKET}

Prometheus scrape endpoint (/metrics)

Enable Prometheus scrape endpoint at /metrics

Priority: SCRIBE_PROMETHEUS_ENABLED > config

| Property | Value |
| --- | --- |
| Default | true |
| Override | SCRIBE_PROMETHEUS_ENABLED (optional) |
monitoring.prometheus.enabled = true
monitoring.prometheus.enabled = ${?SCRIBE_PROMETHEUS_ENABLED}

The interval at which metrics are scraped internally. Background scraping decouples collection latency from HTTP response time. Prometheus itself scrapes every 60 seconds by default; the default here is 30 seconds (half of that).

Priority: SCRIBE_PROMETHEUS_SCRAPE_INTERVAL > config

| Property | Value |
| --- | --- |
| Default | 30 seconds |
| Override | SCRIBE_PROMETHEUS_SCRAPE_INTERVAL (optional) |
monitoring.prometheus.scrape-interval = 30 seconds
monitoring.prometheus.scrape-interval = ${?SCRIBE_PROMETHEUS_SCRAPE_INTERVAL}

Socket for /metrics endpoint. Falls back to monitoring.socket if unset.

Priority: SCRIBE_PROMETHEUS_SOCKET > config

| Property | Value |
| --- | --- |
| Override | SCRIBE_PROMETHEUS_SOCKET (optional) |
monitoring.prometheus.socket = ${?SCRIBE_PROMETHEUS_SOCKET}

OpenTelemetry (OTel) configuration

Environment variable overrides (highest priority wins):

  • Priority: SCRIBE_* > OTEL_* > config file > default
  • HOCON substitution: later values override earlier ones

Enablement semantics:

  • OTel SDK is automatically enabled when any metric export is active: monitoring.prometheus.enabled=true OR monitoring.metrics.enabled=true
  • OTEL_SDK_DISABLED=true disables OTLP exporters but allows Prometheus pull
  • To disable tracing: set monitoring.traces.enabled=false

Resource attributes for all telemetry

Deployment environment (e.g., “production”, “staging”, “dev”)

Priority: SCRIBE_RESOURCE_ENVIRONMENT > OTEL_DEPLOYMENT_ENVIRONMENT > config

| Property | Value |
| --- | --- |
| Default | null |
| Override | SCRIBE_RESOURCE_ENVIRONMENT (optional) > OTEL_DEPLOYMENT_ENVIRONMENT (standard) |
monitoring.resource.environment = ${?OTEL_DEPLOYMENT_ENVIRONMENT}
monitoring.resource.environment = ${?SCRIBE_RESOURCE_ENVIRONMENT}

Service name

Priority: SCRIBE_RESOURCE_SERVICE_NAME > OTEL_SERVICE_NAME > config

| Property | Value |
| --- | --- |
| Default | "identity-scribe" |
| Override | SCRIBE_RESOURCE_SERVICE_NAME (optional) > OTEL_SERVICE_NAME (standard) |
monitoring.resource.service-name = "identity-scribe"
monitoring.resource.service-name = ${?OTEL_SERVICE_NAME}
monitoring.resource.service-name = ${?SCRIBE_RESOURCE_SERVICE_NAME}

Service version (auto-detected from Implementation-Version if unset)

Priority: SCRIBE_RESOURCE_SERVICE_VERSION > OTEL_SERVICE_VERSION > config

| Property | Value |
| --- | --- |
| Default | null |
| Override | SCRIBE_RESOURCE_SERVICE_VERSION (optional) > OTEL_SERVICE_VERSION (standard) |
monitoring.resource.service-version = ${?OTEL_SERVICE_VERSION}
monitoring.resource.service-version = ${?SCRIBE_RESOURCE_SERVICE_VERSION}

Socket reference for monitoring endpoints. Use a named socket from http.sockets.* or omit for @default. If unset (no env var, key omitted, or value is null), defaults to @default.

Priority: SCRIBE_MONITORING_SOCKET > config

| Property | Value |
| --- | --- |
| Override | SCRIBE_MONITORING_SOCKET (optional) |
monitoring.socket = ${?SCRIBE_MONITORING_SOCKET}
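
A sketch of how the socket fallback might be used: one shared monitoring socket, with the Prometheus endpoint moved to its own socket. The socket names "admin" and "metrics" are hypothetical and assumed to be defined under http.sockets.

```hocon
monitoring {
  # Default socket for all monitoring endpoints (hypothetical name).
  socket = "admin"

  # /metrics is served on its own socket (hypothetical name);
  # health and observe endpoints fall back to monitoring.socket.
  prometheus.socket = "metrics"
}
```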

Trace exporter configuration

Enable trace export

Default: false (opt-in for tracing)

Priority: SCRIBE_TRACES_ENABLED > config

| Property | Value |
| --- | --- |
| Default | false |
| Override | SCRIBE_TRACES_ENABLED (optional) |
monitoring.traces.enabled = false
monitoring.traces.enabled = ${?SCRIBE_TRACES_ENABLED}

OTLP endpoint for traces

Priority: SCRIBE_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT > OTEL_EXPORTER_OTLP_ENDPOINT > config

| Property | Value |
| --- | --- |
| Default | "http://localhost:4317" |
| Override | SCRIBE_TRACES_ENDPOINT (optional) > OTEL_EXPORTER_OTLP_TRACES_ENDPOINT (standard) > OTEL_EXPORTER_OTLP_ENDPOINT (standard) |
monitoring.traces.endpoint = "http://localhost:4317"
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_ENDPOINT}
monitoring.traces.endpoint = ${?OTEL_EXPORTER_OTLP_TRACES_ENDPOINT}
monitoring.traces.endpoint = ${?SCRIBE_TRACES_ENDPOINT}

  • When protocol is "http/protobuf" and the endpoint still uses the default gRPC port (4317), Identity Scribe will automatically rewrite it to port 4318.
  • For "http/protobuf", if the endpoint does not already include "/v1/traces", it will be appended.

Protocol: "grpc" (port 4317) or "http/protobuf" (port 4318)

Priority: SCRIBE_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL > OTEL_EXPORTER_OTLP_PROTOCOL > config

| Property | Value |
| --- | --- |
| Default | "grpc" |
| Override | SCRIBE_TRACES_PROTOCOL (optional) > OTEL_EXPORTER_OTLP_TRACES_PROTOCOL (standard) > OTEL_EXPORTER_OTLP_PROTOCOL (standard) |
monitoring.traces.protocol = "grpc"
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_PROTOCOL}
monitoring.traces.protocol = ${?OTEL_EXPORTER_OTLP_TRACES_PROTOCOL}
monitoring.traces.protocol = ${?SCRIBE_TRACES_PROTOCOL}

Export timeout

Priority: SCRIBE_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT > OTEL_EXPORTER_OTLP_TIMEOUT > config

| Property | Value |
| --- | --- |
| Default | 10 seconds |
| Override | SCRIBE_TRACES_TIMEOUT (optional) > OTEL_EXPORTER_OTLP_TRACES_TIMEOUT (standard) > OTEL_EXPORTER_OTLP_TIMEOUT (standard) |
monitoring.traces.timeout = 10 seconds
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TIMEOUT}
monitoring.traces.timeout = ${?OTEL_EXPORTER_OTLP_TRACES_TIMEOUT}
monitoring.traces.timeout = ${?SCRIBE_TRACES_TIMEOUT}
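
For completeness, a minimal sketch of a trace export configuration over gRPC. The collector hostname is hypothetical and the shorter timeout is an arbitrary illustration; any SCRIBE_TRACES_* or OTEL_EXPORTER_OTLP_* variables present in the environment take priority over these values.

```hocon
monitoring.traces {
  # Tracing is opt-in.
  enabled = true

  protocol = "grpc"

  # Hypothetical collector host on the default gRPC port.
  endpoint = "http://otel-collector.example.internal:4317"

  # Give up on an export attempt after 5 seconds instead of the default 10.
  timeout = 5 seconds
}
```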