This page documents all metrics emitted by IdentityScribe. Metric names use OTel dot notation (e.g., scribe.channel.requests.total), which is auto-converted to Prometheus underscore notation (scribe_channel_requests_total) for scraping.
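As a small illustration of that mapping (a sketch only; the real exporter may apply additional rules such as unit suffixes or character sanitization), the dot-to-underscore conversion looks like this:

```python
def otel_to_prometheus(name: str) -> str:
    """Sketch of the dot-to-underscore mapping described above; the actual
    exporter may apply further rules (unit suffixes, sanitization)."""
    return name.replace(".", "_")

# "scribe.channel.requests.total" -> "scribe_channel_requests_total"
print(otel_to_prometheus("scribe.channel.requests.total"))
```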
Static application information gauge (always 1). Labels contain name, vendor, and version. Use for deployment inventory dashboards.
| Property | Value |
|---|---|
| Prometheus | scribe_info |
| Type | Gauge |
| Unit | unitless |
Configured query concurrency limit: the maximum number of concurrent queries allowed. Compare with the in-flight metrics to gauge utilization.
| Property | Value |
|---|---|
| Prometheus | scribe_concurrency |
| Type | Gauge |
| Unit | unitless |
License expiration timestamp (Unix epoch seconds). Monitor for upcoming license renewals. A value of 0 indicates no license or a perpetual license.
| Property | Value |
|---|---|
| Prometheus | scribe_license_expiration |
| Type | Gauge |
| Unit | seconds |
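Because the gauge holds a Unix epoch timestamp, time remaining on the license can be derived by subtracting the current time. A minimal sketch (the sample value below is hypothetical):

```python
import time
from typing import Optional

def license_days_remaining(expiration_epoch_seconds: float) -> Optional[float]:
    """Days until expiry of scribe_license_expiration; None when the value is 0
    (no license or a perpetual license). Negative means already expired."""
    if expiration_epoch_seconds == 0:
        return None
    return (expiration_epoch_seconds - time.time()) / 86400.0

print(license_days_remaining(1767225600))  # hypothetical scraped gauge value
```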
Aggregate active DB connections across all pools. Used for scheduling thresholds and capacity planning.
| Property | Value |
|---|---|
| Prometheus | scribe_db_connections_active |
| Type | Gauge |
| Unit | connections |
Aggregate pending DB connections (threads waiting) across all pools. Non-zero values indicate connection pool exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_db_connections_pending |
| Type | Gauge |
| Unit | connections |
Aggregate active LDAP connections across all hosts. Used for scheduling thresholds and upstream load tracking.
| Property | Value |
|---|---|
| Prometheus | scribe_ldap_connections_active |
| Type | Gauge |
| Unit | connections |
DB connection pool pressure (active / max) across all pools, ranging from 0.0 to 1.0. Measures utilization of the database connection pools; high values (>0.8) indicate potential connection exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_db_pool_pressure |
| Type | Gauge |
| Unit | ratio |
JVM memory pressure (used heap / max heap), ranging from 0.0 to 1.0. Sustained values above 0.9 indicate the heap is running close to its configured maximum.
| Property | Value |
|---|---|
| Prometheus | jvm_memory_pressure |
| Type | Gauge |
| Unit | ratio |
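Both pressure gauges above are 0.0-to-1.0 ratios with suggested thresholds (0.8 for the DB pools, 0.9 for the JVM heap). The sketch below checks them through the Prometheus HTTP API; the server address is an assumption, and in practice a Prometheus alerting rule would normally express the same condition.

```python
from typing import Optional

import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server scraping IdentityScribe

# Gauge -> threshold, per the descriptions above.
THRESHOLDS = {
    "scribe_db_pool_pressure": 0.8,
    "jvm_memory_pressure": 0.9,
}

def instant_value(metric: str) -> Optional[float]:
    """Latest sample of a metric via the Prometheus instant-query API."""
    resp = requests.get(f"{PROM}/api/v1/query", params={"query": metric}, timeout=5)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else None

for metric, limit in THRESHOLDS.items():
    value = instant_value(metric)
    if value is not None and value > limit:
        print(f"WARNING: {metric}={value:.2f} exceeds {limit}")
```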
JVM thread count. Current number of live threads.
| Property | Value |
|---|---|
| Prometheus | jvm_thread_count |
| Type | Gauge |
| Unit | threads |
JVM CPU count. Number of processors available to the JVM.
| Property | Value |
|---|---|
| Prometheus | jvm_cpu_count |
| Type | Gauge |
| Unit | cpus |
Process uptime in seconds.
| Property | Value |
|---|---|
| Prometheus | process_uptime_seconds |
| Type | Gauge |
| Unit | seconds |
Count of queries rejected in the 5-minute window due to permit exhaustion. Non-zero values indicate overload.
| Property | Value |
|---|---|
| Prometheus | scribe_query_rejected_5m |
| Type | Gauge |
| Unit | queries |
Internal service (ingest, maintenance) restarts in the 5-minute window. Frequent restarts indicate instability.
| Property | Value |
|---|---|
| Prometheus | scribe_service_restarts_5m |
| Type | Gauge |
| Unit | restarts |
Query hints dropped in the 5-minute window due to queue saturation.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_dropped_5m |
| Type | Gauge |
| Unit | hints |
Channel request latency p95 (aggregate, 5-minute window). 95th percentile response time across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_latency_p95 |
| Type | Gauge |
| Unit | seconds |
Channel requests per second (aggregate, 5-minute window). Request rate across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_requests_per_second |
| Type | Gauge |
| Unit | requests/second |
Traffic ratio vs baseline (aggregate). Current request rate divided by adaptive baseline. Values >2 indicate traffic spike.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_traffic_ratio |
| Type | Gauge |
| Unit | ratio |
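For example (hypothetical numbers): if the aggregate channel rate is 240 requests/second against an adaptive baseline of 80 requests/second, the ratio is 240 / 80 = 3.0, which exceeds the >2 spike threshold.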
Channel error rate percentage (aggregate, 5-minute window), ranging from 0 to 100. Percentage of requests that resulted in errors.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_error_rate_percent |
| Type | Gauge |
| Unit | percent |
Ingest task duration p95 (aggregate, 5-minute window). 95th percentile ingest task processing time.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_task_duration_p95 |
| Type | Gauge |
| Unit | seconds |
Maximum ingest replication lag across all entry types. Time since last sync from upstream. High values indicate sync backlog.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_lag_max_seconds |
| Type | Gauge |
| Unit | seconds |
Ingest changes per second (aggregate, 5-minute window). Rate of change detection from upstream.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_changes_per_second |
| Type | Gauge |
| Unit | changes/second |
Ingest events per second (aggregate, 5-minute window). Rate of events written to store.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_events_per_second |
| Type | Gauge |
| Unit | events/second |
Ingest task failure rate percentage (aggregate, 5-minute window), ranging from 0 to 100. Percentage of ingest tasks that failed (measured per task, not per change).
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_failed_rate_percent |
| Type | Gauge |
| Unit | percent |
Channel request latency p95 by channel.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_latency_p95 |
| Type | Gauge |
| Unit | seconds |
| Dimensions | channel |
Channel requests per second by channel.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_requests_per_second |
| Type | Gauge |
| Unit | requests/second |
| Dimensions | channel |
Channel error rate percentage by channel, ranging from 0 to 100.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_channel_error_rate_percent |
| Type | Gauge |
| Unit | percent |
| Dimensions | channel |
Ingest task duration p95 by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_task_duration_p95 |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Ingest changes per second by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_changes_per_second |
| Type | Gauge |
| Unit | changes/second |
| Dimensions | entry_type |
Ingest events per second by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_events_per_second |
| Type | Gauge |
| Unit | events/second |
| Dimensions | entry_type |
Ingest task failure rate percentage by entry type, ranging from 0 to 100.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_failed_rate_percent |
| Type | Gauge |
| Unit | percent |
| Dimensions | entry_type |
Ingest replication lag by entry type.
| Property | Value |
|---|---|
| Prometheus | scribe_signals_ingest_entry_lag_seconds |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Total connections established to protocol channels. Counts new connections (not requests).
| Property | Value |
|---|---|
| Prometheus | scribe_channel_connections_total |
| Type | Counter |
| Unit | connections |
| Dimensions | channel, scribe.socket |
Current number of active connections. Open client connections across all channels.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_connections_active |
| Type | Gauge |
| Unit | connections |
| Dimensions | channel, scribe.socket |
Total requests forwarded to upstream directory. Requests that could not be served locally.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_forwarded_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, op |
Total requests received across all protocol channels. Incremented at the end of each request regardless of outcome. Filter by the result dimension to separate successful from failed requests.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_requests_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, op, result |
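To separate successful from failed requests, rate the counter and filter or group by the result dimension. The sketch below embeds such a PromQL expression in a query against the Prometheus HTTP API; the `result="error"` selector and the server address are assumptions for illustration, since the concrete result label values are not listed here.

```python
import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server address

# Share of failed requests per channel over 5 minutes. result="error" is an
# assumed label value; substitute whatever your deployment actually emits.
QUERY = (
    'sum by (channel) (rate(scribe_channel_requests_total{result="error"}[5m])) '
    "/ sum by (channel) (rate(scribe_channel_requests_total[5m]))"
)

data = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=5).json()
for series in data["data"]["result"]:
    print(series["metric"].get("channel", "?"), f"{float(series['value'][1]):.2%} errors")
```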
Duration of request processing in seconds. Measures wall-clock time from request receipt to response completion. Use percentiles (p50, p95, p99) for latency SLOs.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_request_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | channel, op, result |
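Percentiles for a histogram are computed from its bucket series at query time. A sketch of a per-channel p95 query, assuming the buckets are exposed with the conventional `_bucket` suffix and a Prometheus server at the address shown:

```python
import requests  # third-party: pip install requests

PROM = "http://localhost:9090"  # assumed Prometheus server address

# p95 request latency per channel over the last 5 minutes. Assumes the bucket
# series is named scribe_channel_request_duration_seconds_bucket.
QUERY = (
    "histogram_quantile(0.95, sum by (channel, le) "
    "(rate(scribe_channel_request_duration_seconds_bucket[5m])))"
)

data = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=5).json()
for series in data["data"]["result"]:
    print(series["metric"].get("channel", "?"), series["value"][1], "seconds")
```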
Current number of in-flight requests. Useful for detecting request queuing and concurrency limits. A sustained high value may indicate resource exhaustion.
| Property | Value |
|---|---|
| Prometheus | scribe_channel_inflight |
| Type | UpDownCounter |
| Unit | requests |
| Dimensions | channel |
Current number of hints in the persistence queue. Hints are query patterns used for index recommendations. A growing queue indicates high query diversity.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_queue_size |
| Type | Gauge |
| Unit | unitless |
Total hints dropped due to queue overflow. Indicates the hint queue is saturated. Consider increasing queue capacity if dropping frequently.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_queue_dropped_total |
| Type | Counter |
| Unit | unitless |
Total hints successfully persisted. Counts query patterns saved for analysis.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_persisted_total |
| Type | Counter |
| Unit | unitless |
Total hints excluded by persistence rules. Counts hints that matched an exclude rule and were not persisted to the database. These hints are still emitted via metrics and logs.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_excluded_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | hint_type |
Total queries that used an index effectively. Filter by category and attribute dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_usage_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total index recommendations generated. Missing indexes detected from query patterns.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_recommendation_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total queries that triggered sequential scans. Queries without suitable index support. Consider adding indexes for frequent patterns.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_index_seqscan_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | category, attribute |
Total sequential scan hints suppressed because the query was fast and produced no actionable diagnosis.
| Property | Value |
|---|---|
| Prometheus | scribe_hints_seqscan_suppressed_total |
| Type | Counter |
| Unit | unitless |
Current replication lag in seconds per entry type. Time since the last change from the upstream directory was processed. High lag indicates slow synchronization or upstream issues.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_lag_seconds |
| Type | Gauge |
| Unit | seconds |
| Dimensions | entry_type |
Current ingest queue utilization ratio (0.0 to 1.0). Ratio of pending changes to queue capacity. Values near 1.0 indicate change processing cannot keep up.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_pressure |
| Type | Gauge |
| Unit | unitless |
| Dimensions | entry_type |
Current ingest task slot utilization ratio (0.0 to 1.0). Ratio of active ingest tasks to maximum allowed. High values indicate heavy synchronization activity.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_task_pressure |
| Type | Gauge |
| Unit | unitless |
| Dimensions | entry_type |
Total directory changes detected from upstream. Counts add, modify, delete, and move events. Filter by entry_type and event_type dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_changes_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type, event_type |
Total change events successfully written to the store. Difference between detected changes and written events indicates filtered or deduplicated entries.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_events_written_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type, event_type |
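For example (hypothetical rates): if changes are detected at 50/second but events are written at 42/second over the same window, roughly 8 entries/second were filtered or deduplicated.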
Current number of active ingest tasks. Tasks process batches of directory changes. A sustained high value indicates heavy sync load.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_active |
| Type | Gauge |
| Unit | tasks |
| Dimensions | entry_type |
Current number of throttled ingest tasks. Tasks paused due to rate limiting or backpressure. Non-zero values indicate the system is self-regulating.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_throttled |
| Type | Gauge |
| Unit | tasks |
| Dimensions | entry_type |
Total ingest tasks started. Compare with failed total to calculate success rate.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_total |
| Type | Counter |
| Unit | tasks |
| Dimensions | entry_type |
Total ingest tasks that failed. Includes retryable and non-retryable failures. Check logs for failure details.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_tasks_failed_total |
| Type | Counter |
| Unit | tasks |
| Dimensions | entry_type |
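For example (hypothetical rates): if ingest tasks start at 12/second and fail at 0.3/second over the same window, the success rate is 1 - 0.3/12 ≈ 97.5%.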
Duration of ingest task execution. Time to process a batch of changes including DB writes. High durations may indicate database performance issues.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_task_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Time spent waiting due to throttling. Cumulative delay from rate limiting. Monitors the cost of backpressure.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_throttle_wait_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Total entries skipped because they were unchanged. Directory reported a change but entry content was identical. High values may indicate noisy upstream notifications.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_unchanged_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total entries skipped due to filter rules. Entries excluded by transcribe configuration filters.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_skipped_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total times ingest was blocked waiting for queue capacity. Indicates queue saturation events. Frequent blocking suggests increasing queue size.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_blocked_total |
| Type | Counter |
| Unit | events |
| Dimensions | entry_type |
Time spent waiting for queue capacity. Measures blocking delay when queues are full.
| Property | Value |
|---|---|
| Prometheus | scribe_ingest_queue_wait_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Duration of maintenance operations. Periodic cleanup tasks like vacuuming and statistics refresh.
| Property | Value |
|---|---|
| Prometheus | scribe_maintenance_duration_seconds |
| Type | Histogram |
| Unit | seconds |
Total maintenance operations that failed. Check logs for failure reasons.
| Property | Value |
|---|---|
| Prometheus | scribe_maintenance_failed_total |
| Type | Counter |
| Unit | unitless |
Duration of individual query pipeline stages. Helps identify which stage is the bottleneck (parse, normalize, plan, compile, execute, map, encode). Filter by stage dimension for per-stage analysis.
| Property | Value |
|---|---|
| Prometheus | scribe_query_stage_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | channel, stage |
Total queries by shape classification. Tracks query patterns: base lookups, list queries, paginated queries. Useful for understanding query workload characteristics.
| Property | Value |
|---|---|
| Prometheus | scribe_query_shapes_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, shape |
Current query permit utilization ratio (0.0 to 1.0). Indicates how close the system is to query concurrency limits. Values approaching 1.0 suggest imminent query rejection.
| Property | Value |
|---|---|
| Prometheus | scribe_query_permit_pressure |
| Type | Gauge |
| Unit | unitless |
Number of threads waiting to acquire query permits. Indicates queuing pressure. High values suggest queries are waiting longer than expected for permits. Monitor alongside permit_pressure to understand concurrency bottlenecks.
| Property | Value |
|---|---|
| Prometheus | scribe_query_permit_queue |
| Type | Gauge |
| Unit | threads |
Total queries rejected due to resource limits. Indicates backpressure is being applied. Check result dimension for rejection reason (resource_exhausted, deadline_exceeded, etc.).
| Property | Value |
|---|---|
| Prometheus | scribe_query_rejected_total |
| Type | Counter |
| Unit | requests |
| Dimensions | channel, result |
Total entries verified during reconciliation. Full-sync operations compare cached entries with upstream.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_entries_verified_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Total stale entries deleted during reconciliation. Entries that exist locally but not upstream are removed.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_entries_deleted_total |
| Type | Counter |
| Unit | entries |
| Dimensions | entry_type |
Duration of reconciliation operations. Full-sync duration depends on entry count and upstream latency.
| Property | Value |
|---|---|
| Prometheus | scribe_reconciliation_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Total segments that exceeded the leak detection threshold. Indicates potential resource leaks in span tracking. Investigate if this counter increases.
| Property | Value |
|---|---|
| Prometheus | scribe_telemetry_segment_leaks_total |
| Type | Counter |
| Unit | unitless |
Total leaked segments that eventually ended (post-leak closure). Segments closed after being flagged as leaks. Useful for distinguishing true leaks from slow operations.
| Property | Value |
|---|---|
| Prometheus | scribe_telemetry_segment_leaks_ended_after_total |
| Type | Counter |
| Unit | unitless |
Total restarts of internal services. Counts automatic recovery attempts. Frequent restarts indicate instability.
| Property | Value |
|---|---|
| Prometheus | scribe_service_restarts_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | service |
Current service health status (1=up, 0=down). Use for availability dashboards and alerts. A value of 0 indicates the service is not running.
| Property | Value |
|---|---|
| Prometheus | scribe_service_up |
| Type | Gauge |
| Unit | unitless |
| Dimensions | service |
Total service state transitions. Counts state changes like starting, running, stopping. Filter by from and to dimensions.
| Property | Value |
|---|---|
| Prometheus | scribe_service_transitions_total |
| Type | Counter |
| Unit | unitless |
| Dimensions | service, from, to |
Duration of service state transitions. Time spent transitioning between states. Long transitions may indicate initialization issues.
| Property | Value |
|---|---|
| Prometheus | scribe_service_transition_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | service, from, to |
Duration of store commit operations. Time to persist changes to the database. Spikes may indicate transaction contention.
| Property | Value |
|---|---|
| Prometheus | scribe_store_commit_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
Time spent waiting for commit lock. Serialization delay when multiple writers contend. High values suggest commit throughput bottleneck.
| Property | Value |
|---|---|
| Prometheus | scribe_store_commit_wait_duration_seconds |
| Type | Histogram |
| Unit | seconds |
| Dimensions | entry_type |
These are standard OTel runtime metrics. See OTel Semantic Conventions for details.
| Prometheus |
|---|
| jvm_memory_used |
| jvm_memory_committed |
| jvm_memory_limit |
| jvm_memory_pressure |
| jvm_thread_count |
| jvm_cpu_count |
| process_uptime |