This page documents all metrics emitted by IdentityScribe. Metric names use OTel dot notation (e.g., scribe.channel.requests.total), which is auto-converted to Prometheus underscore notation (scribe_channel_requests_total) for scraping.
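As a small illustration of that mapping (a sketch only; the real exporter may apply additional rules such as unit suffixes or character sanitization), the dot-to-underscore conversion looks like this:

```python
def otel_to_prometheus(name: str) -> str:
    """Sketch of the dot-to-underscore mapping described above; the actual
    exporter may apply further rules (unit suffixes, sanitization)."""
    return name.replace(".", "_")

# "scribe.channel.requests.total" -> "scribe_channel_requests_total"
print(otel_to_prometheus("scribe.channel.requests.total"))
```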
Static application information gauge (always 1). Labels contain name, vendor, and version. Use for deployment inventory dashboards.
| Property | Value |
|---|---|
| Prometheus | scribe_info |
| Type | Gauge |
| Unit | unitless |
Configured query concurrency limit: the maximum number of concurrent queries allowed. Compare with the in-flight metrics to gauge utilization.
| Property | Value |
|---|---|
| Prometheus | scribe_concurrency |
| Type | Gauge |
| Unit | unitless |
License expiration timestamp (Unix epoch seconds). Monitor for upcoming license renewals. A value of 0 indicates no license or a perpetual license.
| Property | Value |
|---|---|
| Prometheus | scribe_license_expiration |
| Type | Gauge |
| Unit | seconds |
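Because the gauge holds a Unix epoch timestamp, time remaining on the license can be derived by subtracting the current time. A minimal sketch (the sample value below is hypothetical):

```python
import time
from typing import Optional

def license_days_remaining(expiration_epoch_seconds: float) -> Optional[float]:
    """Days until expiry of scribe_license_expiration; None when the value is 0
    (no license or a perpetual license). Negative means already expired."""
    if expiration_epoch_seconds == 0:
        return None
    return (expiration_epoch_seconds - time.time()) / 86400.0

print(license_days_remaining(1767225600))  # hypothetical scraped gauge value
```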
Aggregate active DB connections across all pools. Used for scheduling thresholds and capacity planning.
| Property | Value |
|---|---|
| Prometheus | scribe_db_connections_active |
| Type | Gauge |
| Unit | connections |
Aggregate pending DB connections (threads waiting) across all pools. Non-zero values indicate connection pool exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_db_connections_pending |
| Type | Gauge |
| Unit | connections |
Aggregate active LDAP connections across all hosts. Used for scheduling thresholds and upstream load tracking.
| Property | Value |
|---|---|
| Prometheus | scribe_ldap_connections_active |
| Type | Gauge |
| Unit | connections |
DB connection pool pressure (active / max) across all pools, ranging from 0.0 to 1.0. Measures utilization of the database connection pools; high values (>0.8) indicate potential connection exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_db_pool_pressure |
| Type | Gauge |
| Unit | ratio |
JVM memory pressure (used heap / max heap), ranging from 0.0 to 1.0. Sustained values above 0.9 indicate the heap is running close to its configured maximum.
| Property | Value |
|---|---|
| Prometheus | jvm_memory_pressure |
| Type | Gauge |
| Unit | ratio |
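Both pressure gauges above are 0.0-to-1.0 ratios with suggested thresholds (0.8 for the DB pools, 0.9 for the JVM heap). The sketch below checks them through the Prometheus HTTP API; the server address is an assumption, and in practice a Prometheus alerting rule would normally express the same condition.

```python
from typing import Optional

import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server scraping IdentityScribe

# Gauge -> threshold, per the descriptions above.
THRESHOLDS = {
    "scribe_db_pool_pressure": 0.8,
    "jvm_memory_pressure": 0.9,
}

def instant_value(metric: str) -> Optional[float]:
    """Latest sample of a metric via the Prometheus instant-query API."""
    resp = requests.get(f"{PROM}/api/v1/query", params={"query": metric}, timeout=5)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else None

for metric, limit in THRESHOLDS.items():
    value = instant_value(metric)
    if value is not None and value > limit:
        print(f"WARNING: {metric}={value:.2f} exceeds {limit}")
```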
JVM thread count. Current number of live threads.
| Property | Value |
|---|---|
| Prometheus | jvm_thread_count |
| Type | Gauge |
| Unit | threads |
JVM CPU count. Number of processors available to the JVM.
| Property | Value |
|---|---|
| Prometheus | jvm_cpu_count |
| Type | Gauge |
| Unit | cpus |
Process uptime in seconds.
| Property | Value |
|---|---|
| Prometheus | process_uptime_seconds |
| Type | Gauge |
| Unit | seconds |
Count of queries rejected in the 5-minute window due to permit exhaustion. Non-zero values indicate overload.
| Property | Value |
|---|---|
| Prometheus | scribe_query_rejected_5m |
| Type | Gauge |
| Unit | queries |
Internal service (ingest, maintenance) restarts in the 5-minute window. Frequent restarts indicate instability.
| Property | Value |
|---|---|
| Prometheus | scribe_service_restarts_5m |
| Type | Gauge |
| Unit | restarts |
Query hints dropped in the 5-minute window due to queue saturation.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_dropped_5m |
| Type | Gauge |
| Unit | hints |
Channel request latency p95 (aggregate, 5-minute window). 95th percentile response time across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_latency_p95 |
| Type | Gauge |
| Unit | seconds |
Channel requests per second (aggregate, 5-minute window). Request rate across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_requests_per_second |
| Type | Gauge |
| Unit | requests/second |
Traffic ratio vs baseline (aggregate). Current request rate divided by adaptive baseline. Values >2 indicate traffic spike.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_traffic_ratio |
| Type | Gauge |
| Unit | ratio |
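For example (hypothetical numbers): if the aggregate channel rate is 240 requests/second against an adaptive baseline of 80 requests/second, the ratio is 240 / 80 = 3.0, which exceeds the >2 spike threshold.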
Channel error rate percentage (aggregate, 5-minute window), ranging from 0 to 100. Percentage of requests that resulted in errors.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_error_rate_percent |
| Type | Gauge |
| Unit | percent |
Ingest task duration p95 (aggregate, 5-minute window). 95th percentile ingest task processing time.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_task_duration_p95 |
| Type | Gauge |
| Unit | seconds |
Maximum ingest replication lag across all entry types. Time since last sync from upstream. High values indicate sync backlog.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_lag_max_seconds |
| Type | Gauge |
| Unit | seconds |
Ingest changes per second (aggregate, 5-minute window). Rate of change detection from upstream.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_changes_per_second |
| Type | Gauge |
| Unit | changes/second |
Ingest events per second (aggregate, 5-minute window). Rate of events written to store.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_events_per_second |
| Type | Gauge |
| Unit | events/second |
Ingest task failure rate percentage (aggregate, 5-minute window), ranging from 0 to 100. Percentage of ingest tasks that failed (measured per task, not per change).
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_failed_rate_percent |
| Type | Gauge |
| Unit | percent |
Channel request latency p95 by channel.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_latency_p95 |
| Type | Gauge |
| Unit | seconds |
| Dimensions | channel |
Channel requests per second by channel.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_requests_per_second |
| Type | Gauge |
| Unit | requests/second |
| Dimensions | channel |
Channel error rate percentage by channel, ranging from 0 to 100.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_error_rate_percent |
| Type | Gauge |
| Unit | percent |
| Dimensions | channel |
Ingest task duration p95 by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_task_duration_p95 |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Ingest changes per second by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_changes_per_second |
| Type | Gauge |
| Unit | changes/second |
| Dimensions | entry_type |
Ingest events per second by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_events_per_second |
| Type | Gauge |
| Unit | events/second |
| Dimensions | entry_type |
Ingest task failure rate percentage by entry type, ranging from 0 to 100.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_failed_rate_percent |
| Type | Gauge |
| Unit | percent |
| Dimensions | entry_type |
Ingest replication lag by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_lag_seconds |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Total connections established to protocol channels. Counts new connections (not requests).
| Property | Value |
|---|---|
| Prometheus | scribe_channel_connections_total |
| Type | Counter |
| Unit | connections |
| Dimensions | channel, scribe.socket |
Current number of active connections. Open client connections across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_connections_active |
| Type | Gauge |
| Unit | connections |
| Dimensions | channel, scribe.socket |
Total requests forwarded to upstream directory. Requests that could not be served locally.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_forwarded_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, op |
Total requests received across all protocol channels. Incremented at the end of each request regardless of outcome. Filter by the result dimension to separate successful from failed requests.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_requests_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, op, result |
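To separate successful from failed requests, rate the counter and filter or group by the result dimension. The sketch below embeds such a PromQL expression in a query against the Prometheus HTTP API; the `result="error"` selector and the server address are assumptions for illustration, since the concrete result label values are not listed here.

```python
import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server address

# Share of failed requests per channel over 5 minutes. result="error" is an
# assumed label value; substitute whatever your deployment actually emits.
QUERY = (
    'sum by (channel) (rate(scribe_channel_requests_total{result="error"}[5m])) '
    "/ sum by (channel) (rate(scribe_channel_requests_total[5m]))"
)

data = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=5).json()
for series in data["data"]["result"]:
    print(series["metric"].get("channel", "?"), f"{float(series['value'][1]):.2%} errors")
```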
Duration of request processing in seconds. Measures wall-clock time from request receipt to response completion. Use percentiles (p50, p95, p99) for latency SLOs.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_request_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | channel, op, result |
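Percentiles for a histogram are computed from its bucket series at query time. A sketch of a per-channel p95 query, assuming the buckets are exposed with the conventional `_bucket` suffix and a Prometheus server at the address shown:

```python
import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server address

# p95 request latency per channel over the last 5 minutes. Assumes the bucket
# series is named scribe_channel_request_duration_seconds_bucket.
QUERY = (
    "histogram_quantile(0.95, sum by (channel, le) "
    "(rate(scribe_channel_request_duration_seconds_bucket[5m])))"
)

data = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=5).json()
for series in data["data"]["result"]:
    print(series["metric"].get("channel", "?"), series["value"][1], "seconds")
```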
Current number of in-flight requests. Useful for detecting request queuing and concurrency limits. A sustained high value may indicate resource exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_inflight |
| Type | UpDownCounter |
| Unit | requests |
| Dimensions | channel |
Current number of hints in the persistence queue. Hints are query patterns used for index recommendations. A growing queue indicates high query diversity.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_queue_size |
| Type | Gauge |
| Unit | unitless |
Total hints dropped due to queue overflow. Indicates the hint queue is saturated. Consider increasing queue capacity if dropping frequently.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_queue_dropped_total |
| Type | Counter |
| Unit | unitless |
Total hints successfully persisted. Counts query patterns saved for analysis.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_persisted_total |
| Type | Counter |
| Unit | unitless |
Total hints excluded by persistence rules. Counts hints that matched an exclude rule and were not persisted to the database. These hints are still emitted via metrics and logs.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_excluded_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | hint_type |
Total queries that used an index effectively. Filter by category and attribute dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_usage_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total index recommendations generated. Missing indexes detected from query patterns.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_recommendation_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total queries that triggered sequential scans. Queries without suitable index support. Consider adding indexes for frequent patterns.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_seqscan_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total sequential scan hints suppressed because the query was fast and produced no actionable diagnosis.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_seqscan_suppressed_total |
| Type | Counter |
| Unit | unitless |
Current replication lag in seconds per entry type. Time since the last change from the upstream directory was processed. High lag indicates slow synchronization or upstream issues.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_lag_seconds |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Current ingest queue utilization ratio (0.0 to 1.0). Ratio of pending changes to queue capacity. Values near 1.0 indicate change processing cannot keep up.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_pressure |
| Type | Gauge |
| Unit | unitless |
| Dimensions | entry_type |
Current ingest task slot utilization ratio (0.0 to 1.0). Ratio of active ingest tasks to maximum allowed. High values indicate heavy synchronization activity.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_task_pressure |
| Type | Gauge |
| Unit | unitless |
| Dimensions | entry_type |
Total directory changes detected from upstream. Counts add, modify, delete, and move events. Filter by entry_type and event_type dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_changes_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type, event_type |
Total change events successfully written to the store. Difference between detected changes and written events indicates filtered or deduplicated entries.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_events_written_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type, event_type |
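For example (hypothetical rates): if changes are detected at 50/second but events are written at 42/second over the same window, roughly 8 entries/second were filtered or deduplicated.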
Current number of active ingest tasks. Tasks process batches of directory changes. A sustained high value indicates heavy sync load.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_active |
| Type | Gauge |
| Unit | tasks |
| Dimensions | entry_type |
Current number of throttled ingest tasks. Tasks paused due to rate limiting or backpressure. Non-zero values indicate the system is self-regulating.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_throttled |
| Type | Gauge |
| Unit | tasks |
| Dimensions | entry_type |
Total ingest tasks started. Compare with failed total to calculate success rate.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_total |
| Type | Counter |
| Unit | tasks |
| Dimensions | entry_type |
Total ingest tasks that failed. Includes retryable and non-retryable failures. Check logs for failure details.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_failed_total |
| Type | Counter |
| Unit | tasks |
| Dimensions | entry_type |
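For example (hypothetical rates): if ingest tasks start at 12/second and fail at 0.3/second over the same window, the success rate is 1 - 0.3/12 ≈ 97.5%.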
Duration of ingest task execution. Time to process a batch of changes including DB writes. High durations may indicate database performance issues.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_task_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Time spent waiting due to throttling. Cumulative delay from rate limiting. Monitors the cost of backpressure.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_throttle_wait_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Total entries skipped because they were unchanged. Directory reported a change but entry content was identical. High values may indicate noisy upstream notifications.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_unchanged_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total entries skipped due to filter rules. Entries excluded by transcribe configuration filters.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_skipped_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total times ingest was blocked waiting for queue capacity. Indicates queue saturation events. Frequent blocking suggests increasing queue size.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_blocked_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type |
Time spent waiting for queue capacity. Measures blocking delay when queues are full.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_wait_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Duration of maintenance operations. Periodic cleanup tasks like vacuuming and statistics refresh.
| Property | Value |
|---|---|
| Prometheus | scribe_maintenance_duration_seconds |
| Type | Histogram |
| Unit | seconds |
Total maintenance operations that failed. Check logs for failure reasons.
| Property | Value |
|---|---|
| Prometheus | scribe_maintenance_failed_total |
| Type | Counter |
| Unit | unitless |
Duration of individual query pipeline stages. Helps identify which stage is the bottleneck (parse, normalize, plan, compile, execute, map, encode). Filter by stage dimension for per-stage analysis.
| Property | Value |
|---|---|
| Prometheus | scribe_query_stage_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | channel, stage |
Total queries by shape classification. Tracks query patterns: base lookups, list queries, paginated queries. Useful for understanding query workload characteristics.
| Property | Value |
|---|---|
| Prometheus | scribe_query_shapes_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, shape |
Current query permit utilization ratio (0.0 to 1.0). Indicates how close the system is to query concurrency limits. Values approaching 1.0 suggest imminent query rejection.
| Property | Value |
|---|---|
| Prometheus | scribe_query_permit_pressure |
| Type | Gauge |
| Unit | unitless |
Number of threads waiting to acquire query permits. Indicates queuing pressure. High values suggest queries are waiting longer than expected for permits. Monitor alongside permit_pressure to understand concurrency bottlenecks.
| Property | Value |
|---|---|
| Prometheus | scribe_query_permit_queue |
| Type | Gauge |
| Unit | threads |
Total queries rejected due to resource limits. Indicates backpressure is being applied. Check result dimension for rejection reason (resource_exhausted, deadline_exceeded, etc.).
| Property | Value |
|---|---|
| Prometheus | scribe_query_rejected_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, result |
Total entries verified during reconciliation. Full-sync operations compare cached entries with upstream.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_entries_verified_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total stale entries deleted during reconciliation. Entries that exist locally but not upstream are removed.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_entries_deleted_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Duration of reconciliation operations. Full-sync duration depends on entry count and upstream latency.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Total segments that exceeded the leak detection threshold. Indicates potential resource leaks in span tracking. Investigate if this counter increases.
| Property | Value |
|---|---|
| Prometheus | scribe_telemetry_segment_leaks_total |
| Type | Counter |
| Unit | unitless |
Total leaked segments that eventually ended (post-leak closure). Segments closed after being flagged as leaks. Useful for distinguishing true leaks from slow operations.
| Property | Value |
|---|---|
| Prometheus | scribe_telemetry_segment_leaks_ended_after_total |
| Type | Counter |
| Unit | unitless |
Total restarts of internal services. Counts automatic recovery attempts. Frequent restarts indicate instability.
| Property | Value |
|---|---|
| Prometheus | scribe_service_restarts_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | service |
Current service health status (1=up, 0=down). Use for availability dashboards and alerts. A value of 0 indicates the service is not running.
| Property | Value |
|---|---|
| Prometheus | scribe_service_up |
| Type | Gauge |
| Unit | unitless |
| Dimensions | service |
Total service state transitions. Counts state changes like starting, running, stopping. Filter by from and to dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_service_transitions_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | service, from, to |
Duration of service state transitions. Time spent transitioning between states. Long transitions may indicate initialization issues.
| Property | Value |
|---|---|
| Prometheus | scribe_service_transition_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | service, from, to |
Duration of store commit operations. Time to persist changes to the database. Spikes may indicate transaction contention.
| Property | Value |
|---|---|
| Prometheus | scribe_store_commit_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Time spent waiting for commit lock. Serialization delay when multiple writers contend. High values suggest commit throughput bottleneck.
| Property | Value |
|---|---|
| Prometheus | scribe_store_commit_wait_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
These are standard OTel runtime metrics. See OTel Semantic Conventions for details.
| Prometheus |
|---|
| jvm_memory_used |
| jvm_memory_committed |
| jvm_memory_limit |
| jvm_memory_pressure |
| jvm_thread_count |
| jvm_cpu_count |
| process_uptime |