Change Detection

Scribe uses three mechanisms to detect directory changes: persistent search, polling, and reconciliation. Each trades speed for coverage. The fastest fires first; the others fill the gaps.

Change detection cascade: Scribe tries each mechanism in order, fastest first.

  • Persistent search (fastest): sub-second lag; real-time push from the directory
  • Polling: seconds of lag; periodic queries for modified entries
  • Reconciliation (safety net): background full scan that catches anything missed

This page covers each mechanism, when it activates, and the knobs that affect sync lag.

For the high-level data flow, see Architecture. For tuning worker counts and queue sizes, see Transcribes configuration.

When the LDAP directory supports persistent search (through Active Directory DirSync, OpenLDAP syncrepl, or the persistent search control), Scribe registers for real-time change notifications, and the directory pushes changes as they happen.

This is the fastest path. Lag is typically sub-second — bounded by network round-trip and PostgreSQL write latency. Scribe holds the connection open and reconnects automatically if it drops.

Not all directories support persistent search, and some only support it for certain subtrees. When persistent search isn’t available for a transcribe, Scribe falls back to polling.
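The fallback rule can be sketched as a small selection function. This is only an illustration: `DirectoryCapabilities`, `push_subtrees`, and `pick_mechanism` are hypothetical names, not Scribe's API, and the DN comparison is a naive case-insensitive suffix match rather than real DN normalization.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    PERSISTENT_SEARCH = "persistent-search"
    POLLING = "polling"


@dataclass(frozen=True)
class DirectoryCapabilities:
    # Hypothetical capability flags; not Scribe's real data model.
    supports_persistent_search: bool
    # Subtrees where push notifications work; an empty tuple means "everywhere".
    push_subtrees: tuple[str, ...] = ()


def pick_mechanism(caps: DirectoryCapabilities, base_dn: str) -> Mechanism:
    """Choose the fastest mechanism available for one transcribe's base DN."""
    if not caps.supports_persistent_search:
        return Mechanism.POLLING
    if not caps.push_subtrees:
        return Mechanism.PERSISTENT_SEARCH
    dn = base_dn.lower()
    for subtree in caps.push_subtrees:
        suffix = subtree.lower()
        # Naive check: the base DN is the subtree itself or sits beneath it.
        if dn == suffix or dn.endswith("," + suffix):
            return Mechanism.PERSISTENT_SEARCH
    return Mechanism.POLLING
```

A transcribe rooted outside every push-capable subtree ends up on polling, which mirrors the per-transcribe fallback described above.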

When polling, Scribe periodically queries the directory for entries modified since the last check. The idle period between polls defaults to 5 seconds (ldap.idle-period). The effective interval is the search duration plus that idle period.

Worst-case lag is roughly the effective poll interval plus processing time. For a directory with moderate change rates (hundreds of changes per interval), processing adds negligible overhead. For high-rate sources (thousands of changes per poll), consider lowering ldap.idle-period or increasing the worker count.
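As a rough worked example of that arithmetic, the sketch below adds up the pieces. The function names are illustrative, and the 5-second default simply mirrors the documented ldap.idle-period default. The search and processing durations are assumed inputs you would measure yourself.

```python
def effective_poll_interval(search_seconds: float, idle_period_seconds: float = 5.0) -> float:
    """Time from the start of one poll to the start of the next."""
    return search_seconds + idle_period_seconds


def worst_case_poll_lag(
    search_seconds: float,
    processing_seconds: float,
    idle_period_seconds: float = 5.0,
) -> float:
    """Upper-bound estimate: a change lands just after a poll began reading."""
    return effective_poll_interval(search_seconds, idle_period_seconds) + processing_seconds


# A 1.0 s search, the default 5 s idle period, and 0.5 s of processing.
print(worst_case_poll_lag(search_seconds=1.0, processing_seconds=0.5))  # 6.5
```

Halving the idle period in this model only removes 2.5 seconds. Once searches themselves take several seconds, the search duration dominates the effective interval.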

Polling uses modifyTimestamp or an equivalent attribute to detect changes. It misses entries that were modified and then reverted between polls — reconciliation catches those.
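A "modified since" query of this kind is usually expressed as an LDAP filter on a Generalized Time value. The helper below is a minimal sketch of building such a filter. The function name is hypothetical, and the filter Scribe actually sends may differ.

```python
from datetime import datetime, timezone


def modified_since_filter(since: datetime, attribute: str = "modifyTimestamp") -> str:
    """Build an LDAP filter matching entries changed at or after `since`."""
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    # LDAP Generalized Time in UTC, second precision: YYYYMMDDHHMMSSZ.
    stamp = since.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")
    # ">=" rather than ">" so entries sharing the checkpoint's second are not skipped;
    # the consumer must then tolerate seeing the same entry twice.
    return f"({attribute}>={stamp})"


last_check = datetime(2024, 3, 1, 12, 30, 0, tzinfo=timezone.utc)
print(modified_since_filter(last_check))  # (modifyTimestamp>=20240301123000Z)
```

Because the timestamp has one-second resolution, an inclusive comparison trades a few duplicate reads for not losing changes at the boundary.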

Reconciliation is the safety net. It periodically walks the entire directory tree for a transcribe and compares each entry against the local copy. Anything that persistent search or polling missed gets picked up here.

Typical triggers for missed changes:

  • Network partition during a persistent search connection
  • Directory failover to a replica with slightly different state
  • Entries modified while Scribe was shut down
  • Clock skew causing modifyTimestamp queries to miss a window

Reconciliation is pressure-aware. It waits for the system to be idle — low ingest queue pressure, no active tasks consuming significant resources — before starting a full scan. This prevents reconciliation from competing with real-time change processing during peak load.
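One way to picture the pressure check is a simple gate over a snapshot of system load. Everything here is an assumption made for illustration: the `SystemPressure` fields, the 10% queue-fill threshold, and the zero-busy-task rule are not documented Scribe behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemPressure:
    # Hypothetical snapshot of load; field names are illustrative only.
    ingest_queue_depth: int
    ingest_queue_capacity: int
    busy_tasks: int


def is_idle_enough(
    pressure: SystemPressure,
    max_queue_fill: float = 0.10,
    max_busy_tasks: int = 0,
) -> bool:
    """Return True when a full scan would not compete with real-time ingest."""
    if pressure.ingest_queue_capacity <= 0:
        raise ValueError("ingest_queue_capacity must be positive")
    fill = pressure.ingest_queue_depth / pressure.ingest_queue_capacity
    return fill <= max_queue_fill and pressure.busy_tasks <= max_busy_tasks
```

A scheduler built on a gate like this would re-check it periodically and defer the scan while it returns False.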

After startup, Scribe runs an initial reconciliation to catch anything that changed while it was down. Scheduled runs happen only when you set ldap.reconciliation.interval or ldap.reconciliation.cron (there is no default interval).

Every entry has an attribute hash (ETag) computed from its current attribute values. During reconciliation, Scribe compares the directory entry’s ETag against the stored one. If they match, the entry is skipped — no event is written, no database update happens.

This makes reconciliation cheap even for directories with millions of entries. A full scan of a million-entry directory typically takes a few minutes and writes zero events if nothing changed.
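The skip-if-unchanged check can be illustrated with a content hash over an entry's attributes. This is a sketch under stated assumptions: SHA-256, case-insensitive attribute names, and order-independent values are choices made here, not a description of how Scribe actually computes its ETag.

```python
import hashlib


def attribute_etag(attributes: dict[str, list[bytes]]) -> str:
    """Hash attribute names and values in a canonical order (assumed scheme)."""
    digest = hashlib.sha256()
    for name in sorted(attributes, key=str.lower):
        digest.update(name.lower().encode("utf-8"))
        digest.update(b"\x00")
        for value in sorted(attributes[name]):
            # Length-prefix each value so adjacent values cannot run together.
            digest.update(len(value).to_bytes(8, "big"))
            digest.update(value)
        digest.update(b"\x01")
    return digest.hexdigest()


def needs_update(directory_attrs: dict[str, list[bytes]], stored_etag: str) -> bool:
    """True when the directory entry differs from the stored copy."""
    return attribute_etag(directory_attrs) != stored_etag
```

With a scheme like this, an unchanged entry costs one hash and one string comparison, with no event or database write.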

The gap between a change happening in the directory and that change being queryable through Scribe depends on several factors:

| Factor | Impact | Tuning |
|---|---|---|
| Persistent search support | Sub-second lag when available; falls back to poll interval otherwise | Directory-dependent; enable DirSync/syncrepl if possible |
| Poll interval | Directly adds to lag for polled transcribes | ldap.idle-period |
| Worker count | More workers process changes faster under load | transcribes.<name>.transcription.workers |
| PostgreSQL write throughput | Bottleneck under high change rates | Connection pool size, disk I/O, VACUUM ANALYZE |
| Network latency to LDAP | Adds to each poll and persistent search reconnect | Network topology |

Monitor lag with scribe_ingest_lag_seconds per entry type. Sustained lag above 60 seconds warrants investigation — see Monitoring for the diagnostic workflow.
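To make "sustained" concrete, the sketch below flags a run of consecutive samples above the threshold. The sample list stands in for scraped scribe_ingest_lag_seconds values. The function name and the five-sample window are assumptions, not part of Scribe or its monitoring stack.

```python
from collections.abc import Iterable


def sustained_lag(
    samples: Iterable[float],
    threshold_seconds: float = 60.0,
    min_consecutive: int = 5,
) -> bool:
    """True if at least `min_consecutive` samples in a row exceed the threshold."""
    run = 0
    for value in samples:
        # A single sample at or below the threshold resets the run.
        run = run + 1 if value > threshold_seconds else 0
        if run >= min_consecutive:
            return True
    return False
```

Requiring consecutive samples keeps a single slow poll or a brief reconnect from counting as a lag problem.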