Change Detection

Scribe uses three mechanisms to detect directory changes: persistent search, polling, and reconciliation. Each trades speed for coverage. The fastest fires first; the others fill the gaps.

Change detection cascade: Scribe tries each mechanism in order, fastest first.

Persistent search (fastest): sub-second lag; real-time push from the directory
Polling: seconds of lag; periodic queries for modified entries
Reconciliation (safety net): background full scan that catches anything missed

This page covers each mechanism, when it activates, and the knobs that affect sync lag.

For the high-level data flow, see Architecture. For tuning worker counts and queue sizes, see Transcribes configuration.

Persistent search

When the LDAP directory supports it (Active Directory via DirSync, OpenLDAP via syncrepl, or the persistent search control), Scribe registers for real-time change notifications. The directory pushes changes as they happen.

This is the fastest path. Lag is typically sub-second — bounded by network round-trip and PostgreSQL write latency. Scribe holds the connection open and reconnects automatically if it drops.

Not all directories support persistent search, and some only support it for certain subtrees. When persistent search isn’t available for a transcribe, Scribe falls back to polling.
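The fallback rule can be pictured with a minimal sketch. The names `Mechanism` and `choose_mechanism` and the two boolean inputs are hypothetical, introduced here for illustration only; they are not Scribe's API.

```python
from enum import Enum


class Mechanism(Enum):
    PERSISTENT_SEARCH = "persistent_search"
    POLLING = "polling"


def choose_mechanism(supports_persistent_search: bool, subtree_covered: bool) -> Mechanism:
    """Pick the real-time path only when the directory offers it for this subtree.

    Both flags are assumptions for this sketch: one says the directory
    supports persistent search at all, the other says it does so for the
    subtree this transcribe reads.
    """
    if supports_persistent_search and subtree_covered:
        return Mechanism.PERSISTENT_SEARCH
    # Anything else falls back to polling.
    return Mechanism.POLLING
```

A directory that supports persistent search only for some subtrees would yield `Mechanism.POLLING` for the transcribes outside those subtrees.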

Polling

Scribe periodically queries the directory for entries modified since the last check. The idle period between polls defaults to 5 seconds (ldap.idle-period). The effective interval is the search duration plus that idle period.

Lag is roughly the effective poll interval plus processing time. For a directory with moderate change rates (hundreds of changes per interval), processing adds negligible overhead. For high-rate sources (thousands of changes per poll), consider lowering ldap.idle-period or increasing the worker count (transcribes.<name>.transcription.workers).
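As a rough worked example of the arithmetic above, the following sketch adds the pieces together. The function names are hypothetical and the figures are illustrative, not measurements; only the 5-second default idle period comes from this page.

```python
def effective_poll_interval(search_seconds: float, idle_period_seconds: float = 5.0) -> float:
    """Effective interval = how long the search takes + the idle period after it."""
    return search_seconds + idle_period_seconds


def estimated_poll_lag(search_seconds: float,
                       processing_seconds: float,
                       idle_period_seconds: float = 5.0) -> float:
    """Rough lag estimate for a polled transcribe: interval plus processing time."""
    return effective_poll_interval(search_seconds, idle_period_seconds) + processing_seconds


# Illustrative numbers: a 1.5 s search and 0.5 s of processing
# with the default 5 s idle period.
interval = effective_poll_interval(1.5)   # 6.5 seconds
lag = estimated_poll_lag(1.5, 0.5)        # 7.0 seconds
```

Halving the idle period to 2.5 seconds in this example would bring the estimate down to 4.5 seconds, at the cost of more frequent queries against the directory.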

Polling uses modifyTimestamp or an equivalent attribute to detect changes. It misses entries that were modified and then reverted between polls — reconciliation catches those.
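A "modified since" query of this kind is usually expressed as an LDAP filter on the timestamp attribute. The sketch below builds such a filter using LDAP generalized time in UTC. The helper name is hypothetical, and whether Scribe issues exactly this filter is an assumption.

```python
from datetime import datetime, timezone


def modified_since_filter(last_check: datetime, attribute: str = "modifyTimestamp") -> str:
    """Build an LDAP filter matching entries modified at or after last_check.

    The timestamp is rendered as generalized time in UTC (YYYYMMDDHHMMSSZ).
    Pass a timezone-aware datetime; a naive one is interpreted as local
    time by astimezone().
    """
    stamp = last_check.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")
    return f"({attribute}>={stamp})"


example = modified_since_filter(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
# example == "(modifyTimestamp>=20240102030405Z)"
```

Using `>=` rather than a strict comparison means entries stamped exactly at the last check are returned again, which trades a few duplicate reads for not missing a change at the boundary.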

Reconciliation

Reconciliation is the safety net. It periodically walks the entire directory tree for a transcribe and compares each entry against the local copy. Anything that persistent search or polling missed gets picked up here.

Typical triggers for missed changes:

Network partition during a persistent search connection
Directory failover to a replica with slightly different state
Entries modified while Scribe was shut down
Clock skew causing modifyTimestamp queries to miss a window

When it runs

Reconciliation is pressure-aware. It waits for the system to be idle — low ingest queue pressure, no active tasks consuming significant resources — before starting a full scan. This prevents reconciliation from competing with real-time change processing during peak load.
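One way to picture the idle check is as a gate evaluated before a scan starts. Everything in this sketch is an assumption made for illustration: the `SystemPressure` fields, the 10% queue threshold, and the function name are hypothetical and do not describe Scribe's actual signals or limits.

```python
from dataclasses import dataclass


@dataclass
class SystemPressure:
    ingest_queue_depth: int      # items currently waiting in the ingest queue
    ingest_queue_capacity: int   # maximum size of that queue
    heavy_tasks_running: int     # active tasks consuming significant resources


def can_start_reconciliation(p: SystemPressure, max_queue_ratio: float = 0.1) -> bool:
    """Return True only when the queue is nearly empty and no heavy task is active."""
    if p.ingest_queue_capacity <= 0:
        # Without a usable capacity figure, stay on the safe side and wait.
        return False
    queue_ratio = p.ingest_queue_depth / p.ingest_queue_capacity
    return queue_ratio <= max_queue_ratio and p.heavy_tasks_running == 0
```

With these illustrative thresholds, a queue holding 50 of 1000 slots and no heavy tasks would pass the gate, while the same queue at 500 of 1000 would defer the scan.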

After startup, Scribe runs an initial reconciliation to catch anything that changed while it was down. Scheduled runs happen only when you set ldap.reconciliation.interval or ldap.reconciliation.cron (there is no default interval).

ETag deduplication

Every entry has an attribute hash (ETag) computed from its current attribute values. During reconciliation, Scribe compares the directory entry’s ETag against the stored one. If they match, the entry is skipped — no event is written, no database update happens.

This makes reconciliation cheap even for directories with millions of entries. A full scan of a million-entry directory typically takes a few minutes and writes zero events if nothing changed.
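The skip-on-match logic can be sketched as follows. The hash construction here (SHA-256 over sorted attribute names and values) is an assumption chosen for illustration; this page does not specify how Scribe computes its ETag, and the function names are hypothetical.

```python
import hashlib


def compute_etag(attributes: dict[str, list[str]]) -> str:
    """Hash an entry's attributes in an order-independent way (illustrative scheme)."""
    h = hashlib.sha256()
    # Sort names case-insensitively and sort values so that attribute or
    # value ordering returned by the directory does not change the hash.
    for name in sorted(attributes, key=str.lower):
        h.update(name.lower().encode("utf-8"))
        h.update(b"\x00")
        for value in sorted(attributes[name]):
            h.update(value.encode("utf-8"))
            h.update(b"\x00")
        h.update(b"\x01")
    return h.hexdigest()


def reconcile(directory_entries: dict[str, dict[str, list[str]]],
              stored_etags: dict[str, str]) -> list[str]:
    """Return the DNs whose current ETag differs from the stored one."""
    changed = []
    for dn, attrs in directory_entries.items():
        if stored_etags.get(dn) != compute_etag(attrs):
            changed.append(dn)  # new or modified entry: needs an event and an update
        # Matching ETag: skip the entry, nothing is written.
    return changed
```

When nothing has changed, `reconcile` returns an empty list, which corresponds to the zero-event scan described above.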

What affects lag

The gap between a change happening in the directory and that change being queryable through Scribe depends on several factors:

Factor	Impact	Tuning
Persistent search support	Sub-second lag when available; falls back to poll interval otherwise	Directory-dependent — enable DirSync/syncrepl if possible
Poll interval	Directly adds to lag for polled transcribes	`ldap.idle-period`
Worker count	More workers process changes faster under load	`transcribes.<name>.transcription.workers`
PostgreSQL write throughput	Bottleneck under high change rates	Connection pool size, disk I/O, `VACUUM ANALYZE`
Network latency to LDAP	Adds to each poll and persistent search reconnect	Network topology

Monitor lag with scribe_ingest_lag_seconds per entry type. Sustained lag above 60 seconds warrants investigation — see Monitoring for the diagnostic workflow.
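To separate a sustained problem from a brief spike, a check over consecutive samples is one option. This is a minimal sketch: the input is assumed to be a series of scribe_ingest_lag_seconds readings for one entry type, and the five-sample window is a hypothetical choice, not a Scribe default.

```python
def lag_is_sustained(samples: list[float],
                     threshold_seconds: float = 60.0,
                     min_consecutive: int = 5) -> bool:
    """Return True if lag stayed above the threshold for min_consecutive samples in a row."""
    run = 0
    for lag in samples:
        # Extend the current run on a breach, reset it on a healthy sample.
        run = run + 1 if lag > threshold_seconds else 0
        if run >= min_consecutive:
            return True
    return False
```

A single healthy sample resets the count, so a series such as 61, 61, 10, 61, 61 does not trigger with the window set to five.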