Architecture

Your LDAP directories hold identity data. Your apps speak different protocols. IdentityScribe sits between them: it syncs from LDAP into PostgreSQL, then serves that data through LDAP, REST, GraphQL, gRPC, or MCP — whichever your app needs.

System Overview

From LDAP sources to multi-protocol access

LDAP Sources (Active Directory, eDirectory, OpenLDAP, other LDAP) → IdentityScribe: Ingest → PostgreSQL → Query

Scribe watches your LDAP sources for changes. It uses persistent search (real-time push from the directory) where supported, and falls back to polling (periodic queries for entries modified since the last check). A background reconciliation process periodically verifies that nothing was missed.

Each detected change runs through the transcription pipeline: detect the change type (add, modify, move, or delete), apply attribute mappings, and write an immutable event to PostgreSQL. The entry’s current state is updated in the entries table.
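The change-type step can be pictured as a comparison between the state Scribe last stored and the state it just read from the directory. This is a minimal sketch under stated assumptions, not Scribe's actual implementation: the `Entry` type, its stable `uid` key, and `classify_change` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical snapshot of a directory entry."""
    uid: str          # assumed stable identifier that survives renames
    dn: str           # distinguished name (location in the tree)
    attributes: dict  # mapped attribute values


def classify_change(old: Optional[Entry], new: Optional[Entry]) -> Optional[str]:
    """Return 'add', 'modify', 'move', 'delete', or None when nothing changed."""
    if old is None and new is None:
        return None
    if old is None:
        return "add"      # seen for the first time
    if new is None:
        return "delete"   # no longer present in the source
    if old.dn != new.dn:
        return "move"     # same entry, different DN (this sketch ignores attribute edits here)
    if old.attributes != new.attributes:
        return "modify"
    return None
```

In this sketch a rename that also edits attributes is reported as a single move; a real pipeline may well split that into two events.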

If an entry’s attribute hash (its ETag) hasn’t changed, Scribe skips it. This keeps reconciliation cheap even for directories with millions of entries.
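The skip check amounts to comparing a stored digest with a freshly computed one. The sketch below assumes SHA-256 over a canonical JSON form and multi-valued attributes held as lists of strings; the text does not say which hash or canonicalization Scribe uses, and `attribute_hash` and `needs_write` are hypothetical names.

```python
import hashlib
import json


def attribute_hash(attributes: dict) -> str:
    """Stable digest of an entry's attributes, independent of ordering."""
    # Sort attribute names and values so equal content always hashes equally.
    canonical = json.dumps(
        {name: sorted(values) for name, values in attributes.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_write(stored_etag, attributes: dict) -> bool:
    """True when the entry is new or its attributes differ from the stored state."""
    return stored_etag != attribute_hash(attributes)
```

Because the digest is order-independent, a directory that returns the same values in a different order does not trigger a write.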

Changes are recorded as immutable events:

Immutable Event Log

Every change recorded, any point in time queryable

Example timeline for the entry John Doe (email, groups, status):

  • 09:14 (+) User created
  • 09:22 (Δ) Email changed
  • 09:35 (Δ) Group added
  • 09:41 Account disabled
  • now (?) Time-travel query

Legend: Add, Modify, Delete, Query

Nothing in the event log is ever updated or deleted; every change is appended as a new event. This is what makes point-in-time queries and full audit history possible. See Data Model for details.
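A point-in-time query can be understood as replaying the log up to the requested moment. The following is a minimal in-memory sketch of that idea, not the SQL Scribe runs; the `Event` fields and `state_at` are hypothetical, and it assumes timestamps increase with position.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    position: int        # unique, increasing position in the log
    timestamp: datetime
    kind: str            # "add", "modify", "move", or "delete"
    changes: dict        # attribute values set by this event


def state_at(events, moment):
    """Fold events up to `moment` into the entry's state, or None if it did not exist."""
    state = None
    for event in sorted(events, key=lambda e: e.position):
        if event.timestamp > moment:
            break  # later events are invisible to this query
        if event.kind == "delete":
            state = None
        else:
            # Later values overwrite earlier ones; the events themselves never change.
            state = {**(state or {}), **event.changes}
    return state
```

Asking for the state at 09:30 in a log with events at 09:14, 09:22, and 09:41 applies only the first two, which is exactly the time-travel behavior described above.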

Query Pipeline

Unified processing across all channels

Request → 1. Normalize → 2. Plan → 3. Compile → 4. Execute → Response

Every query — regardless of channel — passes through the same four-stage pipeline:

  1. Normalize — Parse the incoming query, validate syntax, resolve attribute aliases to canonical names.
  2. Plan — Analyze the filter, choose which PostgreSQL indexes to use, decide the access strategy.
  3. Compile — Generate the SQL statement, bind parameters, prepare for execution.
  4. Execute — Run the query against PostgreSQL, stream results back, format for the requesting channel.

An LDAP search and a GraphQL query for the same data produce identical results. The channel determines the wire format; the query engine handles everything else.
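To make the four stages concrete, here is a toy version for a single equality condition. It is a sketch under heavy assumptions: SQLite stands in for PostgreSQL, and the alias map, column set, index set, `entries` layout, and all function names are hypothetical.

```python
import sqlite3

# Hypothetical schema knowledge for this sketch.
ALIASES = {"commonname": "cn", "email": "mail"}
COLUMNS = {"cn", "mail"}
INDEXED = {"cn"}


def normalize(attribute, value):
    """Stage 1: validate the attribute and resolve aliases to canonical names."""
    name = attribute.strip().lower()
    name = ALIASES.get(name, name)
    if name not in COLUMNS:
        raise ValueError(f"unknown attribute: {attribute!r}")
    return name, value


def plan(condition):
    """Stage 2: choose an access strategy for the condition."""
    name, _ = condition
    return "index" if name in INDEXED else "scan"


def compile_sql(condition):
    """Stage 3: generate SQL; the value is bound as a parameter, never inlined."""
    name, value = condition  # name was checked against COLUMNS in stage 1
    return f"SELECT dn FROM entries WHERE {name} = ? ORDER BY dn", (value,)


def execute(conn, sql, params):
    """Stage 4: run the statement and collect the matching DNs."""
    return [row[0] for row in conn.execute(sql, params)]


def run_query(conn, attribute, value):
    condition = normalize(attribute, value)
    strategy = plan(condition)
    sql, params = compile_sql(condition)
    return strategy, execute(conn, sql, params)
```

A channel would only translate its wire format into the `(attribute, value)` input and format the returned rows, which is why every channel sees the same results.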

A transcribe tells Scribe what to sync from an LDAP source. You define one per entry type — users, groups, devices, whatever your directory holds. Each transcribe specifies a base DN, filter, scope, attribute mappings, and indexes.

Scribe processes transcribes alphabetically. Each runs its own change detection loop. You can enable or disable individual transcribes without restarting. See Data Model for concepts and Transcribes configuration for the full reference.

Every change Scribe detects produces an immutable event — add, modify, move, or delete. Events are append-only. Each has a unique position and timestamp. The entries table is a materialized view of the latest state, derived from events. This is what makes audit history and point-in-time queries possible.

See Data Model for event types, data formats, and examples.

Some attributes are computed at query time rather than stored. The classic example is memberOf — instead of storing group membership on every user, Scribe runs a subquery at read time to find all groups where the user appears as a member.
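The read-time lookup can be sketched in memory as follows. This illustrates the idea only: Scribe runs a subquery against PostgreSQL, whereas this sketch uses a plain dict of group DN to member DNs, compares DNs as case-insensitive strings (a simplification), and uses the hypothetical names `member_of` and `read_user`.

```python
def member_of(user_dn: str, groups: dict) -> list:
    """Return the DNs of all groups that list `user_dn` as a member."""
    wanted = user_dn.lower()
    return sorted(
        group_dn
        for group_dn, members in groups.items()
        if any(member.lower() == wanted for member in members)
    )


def read_user(user: dict, groups: dict) -> dict:
    """Attach memberOf when the entry is read; nothing is stored on the user."""
    return {**user, "memberOf": member_of(user["dn"], groups)}
```

Because membership lives only on the group, adding a user to a group changes one entry, and every later read of that user reflects it.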

See Data Model for configuration examples.

A channel is a protocol endpoint. All channels share the same query engine, so results are consistent regardless of how you query.

Channel | Protocol | Typical consumer
--------|----------|-----------------
LDAP | LDAP v3 | Legacy apps, directory tools
REST | HTTP/JSON | Microservices, automation
GraphQL | HTTP/GraphQL | Frontends, flexible queries
gRPC | HTTP/2 + Protobuf | Service mesh, high-throughput
MCP | Model Context Protocol | AI coding assistants

Each channel has its own auth rules, size limits, and protocol-specific features, but the query path is identical.

Queries accept filters in four formats. Scribe auto-detects which one you’re using, so you pick whatever fits your context:

Format | Example | Good for
-------|---------|---------
FleX | cn = John | Config files, interactive use
JSON | {"cn": "John"} | SDKs, programmatic queries
SCIM | cn eq "John" | SCIM-aware systems
LDAP | (cn=John) | LDAP clients, existing queries

All four support the same operations: equality, prefix, contains, and boolean logic. FleX and JSON also support attribute groups — searching multiple fields with a single condition (e.g., cn|mail = john matches against both cn and mail).
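The four example filters from the table all express the same condition, which a small sketch can demonstrate by reducing each to an `(attribute, value)` pair. The detection heuristics below are guesses made for illustration, not Scribe's actual rules, and the sketch handles equality only; `detect_format` and `parse_equality` are hypothetical names.

```python
import json
import re


def detect_format(text: str) -> str:
    """Guess the filter format from its shape (assumed heuristics)."""
    stripped = text.strip()
    if stripped.startswith("{"):
        return "json"
    if stripped.startswith("("):
        return "ldap"
    if re.search(r"\seq\s", stripped):
        return "scim"
    return "flex"


def parse_equality(text: str):
    """Reduce a single equality filter in any of the four formats to (attribute, value)."""
    stripped = text.strip()
    fmt = detect_format(stripped)
    if fmt == "json":
        ((attribute, value),) = json.loads(stripped).items()
    elif fmt == "ldap":
        attribute, value = stripped[1:-1].split("=", 1)
    elif fmt == "scim":
        attribute, value = re.split(r"\s+eq\s+", stripped, maxsplit=1)
        value = value.strip('"')
    else:
        attribute, value = stripped.split("=", 1)
    return attribute.strip(), value.strip()
```

Once every format reduces to the same normalized condition, the rest of the query path no longer needs to know which syntax the caller used.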

Filters understand temporal references too. Instead of computing timestamps, you can write modifyTimestamp ge now-1h or createTimestamp ge last 7 days in any format.
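Resolving such a reference means turning it into an absolute timestamp at query time. The sketch below covers only the two shapes shown above; the unit letters other than `h`, the regular expressions, and the `resolve` function are assumptions, so consult the Filters reference for the real grammar.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit letters; only "h" appears in the documented example.
UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def resolve(reference: str, now: datetime) -> datetime:
    """Turn a relative reference such as 'now-1h' or 'last 7 days' into a timestamp."""
    text = reference.strip().lower()
    if text == "now":
        return now
    match = re.fullmatch(r"now-(\d+)([mhd])", text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{UNITS[unit]: amount})
    match = re.fullmatch(r"last (\d+) days?", text)
    if match:
        return now - timedelta(days=int(match.group(1)))
    raise ValueError(f"unsupported temporal reference: {reference!r}")


# Example: the lower bound for "modifyTimestamp ge now-1h".
cutoff = resolve("now-1h", datetime.now(timezone.utc))
```

Passing `now` explicitly keeps the function deterministic, so one query evaluates every temporal reference against the same instant.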

See Filters for the full syntax reference.

Scribe uses three mechanisms to stay in sync, each covering a different failure mode:

  • Persistent search — real-time push notifications from the LDAP server. Lowest latency. Used when the directory supports it.
  • Polling — periodic queries for entries modified since the last check. Fallback when persistent search isn’t available.
  • Reconciliation — background verification of all entries. Catches anything missed by the other two: network glitches, directory failovers, entries modified while Scribe was down.
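The polling query can be illustrated as an LDAP filter that adds a `modifyTimestamp` lower bound to whatever filter selects the entries. This is a sketch of the general technique rather than the exact filter Scribe sends; `generalized_time` and `polling_filter` are hypothetical helpers, and the timestamp uses the standard LDAP Generalized Time form in UTC.

```python
from datetime import datetime, timezone


def generalized_time(moment: datetime) -> str:
    """Format an aware datetime as LDAP Generalized Time in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")


def polling_filter(base_filter: str, last_check: datetime) -> str:
    """Combine the base filter with 'modified since the last check'."""
    return f"(&{base_filter}(modifyTimestamp>={generalized_time(last_check)}))"


# Example: entries of one type changed since 09:30 UTC on 1 January 2024.
example = polling_filter(
    "(objectClass=person)",
    datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc),
)
```

Using `>=` re-reads entries modified exactly at the previous check time; the attribute-hash comparison described earlier makes those repeats harmless.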