SLAPD-SQL(5) | File Formats Manual | SLAPD-SQL(5) |
slapd-sql - SQL backend to slapd
/etc/ldap/slapd.conf
The primary purpose of this slapd(8) backend is to PRESENT information stored in some RDBMS as an LDAP subtree without any programming (some SQL and maybe stored procedures can't be considered programming, anyway ;).
For example, you (say, an ISP) keep account information in an RDBMS and want to use modern solutions that expect such information in LDAP (to authenticate users, perform email lookups, etc.). Or you want to synchronize or distribute information between different sites/applications that use RDBMSes and/or LDAP. Or whatever else...
It is NOT designed as a general-purpose backend that uses RDBMS instead of BerkeleyDB (as the standard BDB backend does), though it can be used as such with several limitations. You can take a look at http://www.openldap.org/faq/index.cgi?file=378 (OpenLDAP FAQ-O-Matic/General LDAP FAQ/Directories vs. conventional databases) to find out more on this point.
The idea (detailed below) is to use some meta-information to translate LDAP queries into SQL queries, leaving the relational schema untouched, so that old applications can continue using it without any modifications. This allows SQL and LDAP applications to inter-operate without replication, and to exchange data as needed.
The SQL backend is designed to be tunable to virtually any relational schema without having to change the source code (through the meta-information mentioned above). It also uses ODBC to connect to RDBMSes, and is highly configurable for the SQL dialects RDBMSes may use, so it may be used for integration and distribution of data across different RDBMSes, OSes, hosts, etc.; in other words, in a highly heterogeneous environment.
This backend is experimental.
These slapd.conf options apply to the SQL backend database, which means that they must follow a "database sql" line and come before any subsequent "backend" or "database" lines. Other database options not specific to this backend are described in the slapd.conf(5) manual page.
dbhost <hostname>
dbpasswd <password>
dbuser <username>
These options specify the host running the RDBMS and the credentials used to connect to it (the actual connection is established through ODBC).
These options (subtree_cond and related directives) specify SQL WHERE-clause templates used to restrict searches to the requested scope (base, one-level, or subtree).
These options specify SQL query templates for loading the schema mapping meta-information, adding entries to and deleting entries from ldap_entries, and so on. All of these, as well as subtree_cond, have default values. For the current values it is recommended to look at the sources, or at the log output when slapd starts with "-d 5" or greater. Note that the number and order of parameters must not be changed.
These statements are used to modify the default behavior of the backend to match the SQL dialect of the RDBMS. The first of these options essentially concern string and DN normalization when building filters. LDAP normalization is more than upper- (or lower-)casing everything; however, as a reasonable trade-off, for case-sensitive RDBMSes the backend can be instructed to uppercase strings and DNs by providing the upper_func directive. Some RDBMSes require a cast in order to apply functions to arbitrary data types, e.g. string constants; this is triggered by the upper_needs_cast directive. If required, a string cast function can be provided as well, using the strcast_func directive. Finally, a custom string concatenation pattern may be required; it is provided through the concat_pattern directive.
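As an illustration only (the exact templates are those configured through subtree_cond and the other statement directives, and the DN pattern bound to the placeholder is just example data), with "upper_func UPPER" a subtree scoping condition could end up looking roughly like this, with the bound value normalized to upper case as well:

SELECT ldap_entries.id FROM ldap_entries
WHERE UPPER(ldap_entries.dn) LIKE CONCAT('%',?)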
fetch_attrs <attrlist>
fetch_all_attrs { NO | yes }
These options control which attributes are fetched together with an entry regardless of the requested attribute list: fetch_attrs names specific attributes to always fetch, while fetch_all_attrs causes all mapped attributes to be fetched.
Almost everything mentioned below is illustrated by the examples in the servers/slapd/back-sql/rdbms_depend/ directory of the OpenLDAP source tree, which contains scripts for generating a sample database for Oracle, MS SQL Server, mySQL and more (including PostgreSQL and IBM db2).
The first thing one must work out is what set of LDAP object classes can represent your RDBMS information.
The easiest way is to create an objectClass for each entity in the ER diagram you used when designing your relational schema. Any relational schema, no matter how normalized, was designed after some model of your application's domain (for instance, accounts, services, etc. for an ISP), and is used in terms of its entities, not just the tables of the normalized schema. This means that for every attribute of every such entity there is an effective SQL query that loads its values.
Also you might want your object classes to conform to some of the standard schemas like inetOrgPerson etc.
Whatever classes you choose, you must define a way to translate LDAP operation requests into (a series of) SQL queries. Let us start with the SEARCH operation.
Example: Let's suppose that we store information about persons working in our organization in two tables:
PERSONS                        PHONES
----------                     -------------
id          integer            id       integer
first_name  varchar            pers_id  integer references persons(id)
last_name   varchar            phone
middle_name varchar
...
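For concreteness, a minimal DDL sketch of these two tables might look as follows (the column sizes and the varchar type for phone are assumptions of this sketch, not requirements of the backend):

CREATE TABLE persons (
    id integer PRIMARY KEY,
    first_name varchar(255),
    last_name varchar(255),
    middle_name varchar(255)
);

CREATE TABLE phones (
    id integer PRIMARY KEY,
    pers_id integer REFERENCES persons(id),
    phone varchar(255)
);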
(PHONES contains telephone numbers associated with persons.) A person can have several numbers, in which case PHONES contains several records with the corresponding pers_id, or none at all (no records in PHONES with that pers_id). An LDAP objectclass to present such information could look like this:
person
-------
MUST cn
MAY telephoneNumber $ firstName $ lastName
...
To fetch all values of the cn attribute for a given person ID, we construct the query:
SELECT CONCAT(persons.first_name,' ',persons.last_name)
AS cn FROM persons WHERE persons.id=?
for telephoneNumber we can use:
SELECT phones.phone AS telephoneNumber FROM persons,phones
WHERE persons.id=phones.pers_id AND persons.id=?
If we wanted to service LDAP requests with filters like (telephoneNumber=123*), we would construct something like:
SELECT ... FROM persons,phones
WHERE persons.id=phones.pers_id
AND persons.id=?
AND phones.phone like '%1%2%3%'
(Note how the telephoneNumber match is expanded into multiple wildcards to account for interspersed, insignificant characters like spaces and dashes; this happens by design because telephoneNumber is defined with a specially recognized syntax.) So, if we had information about which tables contain the values for each attribute, how to join those tables and arrange the values, we could try to generate such statements automatically, and translate search filters into SQL WHERE clauses.
To store such information, we add three more tables to our schema and fill them with data (see the samples):
ldap_oc_mappings (some columns are not listed for clarity)
---------------
id=1
name="person"
keytbl="persons"
keycol="id"
This table defines a mapping between objectclass (its name held in the "name" column), and a table that holds the primary key for corresponding entities. For instance, in our example, the person entity, which we are trying to present as "person" objectclass, resides in two tables (persons and phones), and is identified by the persons.id column (that we will call the primary key for this entity). Keytbl and keycol thus contain "persons" (name of the table), and "id" (name of the column).
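In SQL terms, and assuming the columns omitted above are nullable or have defaults, this mapping could be loaded with something like:

INSERT INTO ldap_oc_mappings (id,name,keytbl,keycol)
VALUES (1,'person','persons','id');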
ldap_attr_mappings (some columns are not listed for clarity)
-----------
id=1
oc_map_id=1
name="cn"
sel_expr="CONCAT(persons.first_name,' ',persons.last_name)"
from_tbls="persons"
join_where=NULL
************
id=<n>
oc_map_id=1
name="telephoneNumber"
sel_expr="phones.phone"
from_tbls="persons,phones"
join_where="phones.pers_id=persons.id"
This table defines mappings between LDAP attributes and the SQL queries that load their values. Note that, unlike the LDAP schema, these are not attribute types: the attribute "cn" for the "person" objectclass can take its values from different tables than "cn" for some other objectclass, so attribute mappings depend on objectclass mappings (unlike attribute types in the LDAP schema, which are independent of objectclasses). Thus, we have the oc_map_id column linking to the ldap_oc_mappings table.
Now we cut the SQL query that loads the values of a given attribute into three parts. The first goes into the sel_expr column: the expression between the SELECT and FROM keywords, which defines WHAT to load. Next is the table list: the text between the FROM and WHERE keywords; it may contain aliases for convenience (see the examples). The last is the part of the WHERE clause which (if it exists at all) expresses the condition for joining the table containing the values with the table containing the primary key (foreign key equality and such). If the values are in the same table as the primary key, this column is left NULL (as for the cn attribute above).
Having this information in parts, we are able not only to construct queries that load attribute values by entry id (for that we could have stored each SQL query as a whole), but also to construct queries that load the ids of objects matching a given search filter (or at least part of it). See below for examples.
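For illustration, and again assuming the omitted columns have defaults, the two attribute mappings shown above could be loaded like this (the second id, shown as <n> above, is arbitrary here):

INSERT INTO ldap_attr_mappings
    (id,oc_map_id,name,sel_expr,from_tbls,join_where)
VALUES (1,1,'cn',
    'CONCAT(persons.first_name,'' '',persons.last_name)',
    'persons',NULL);

INSERT INTO ldap_attr_mappings
    (id,oc_map_id,name,sel_expr,from_tbls,join_where)
VALUES (2,1,'telephoneNumber','phones.phone',
    'persons,phones','phones.pers_id=persons.id');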
ldap_entries
------------
id=1
dn=<dn you choose>
oc_map_id=...
parent=<parent record id>
keyval=<value of primary key>
This table defines mappings between the DNs of entries in your LDAP tree and the primary key values of the corresponding relational data. It has a recursive structure (the parent column references the id column of the same table), which allows you to impose any tree structure(s) on your flat relational data. Given the id of the objectclass mapping, we can determine the table and column of the primary key; keyval stores its value, thus identifying the exact tuple corresponding to the LDAP entry with this DN.
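Continuing the example, rows for a suffix entry and for one person might look like this (the DNs, the numeric ids, and the existence of a second objectclass mapping for the suffix are assumptions made for illustration):

-- hypothetical suffix entry; oc_map_id=2 would be a separate mapping
-- (e.g. for the "organization" objectclass), not shown here
INSERT INTO ldap_entries (id,dn,oc_map_id,parent,keyval)
VALUES (1,'o=Example,c=US',2,0,1);

-- one person entry, child of the suffix, mapped through the "person"
-- objectclass mapping (oc_map_id=1) to the tuple with persons.id=1
INSERT INTO ldap_entries (id,dn,oc_map_id,parent,keyval)
VALUES (2,'cn=John Smith,o=Example,c=US',1,1,1);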
Note that such a design (see the exact SQL table creation query) implies one important constraint: the key must be an integer. But all that I know about well-designed schemas makes me think that this is not very restrictive ;) If anyone needs support for other key types, they may want to write a patch and submit it to the OpenLDAP ITS, then I'll include it.
Also, several users complained that they don't really need very structured trees, and they don't want to update one more table every time they add or delete an instance in the relational schema. Those people can use a view instead of a real table for ldap_entries, something like this (by Robin Elfrink):
CREATE VIEW ldap_entries (id, dn, oc_map_id, parent, keyval)
AS
SELECT 0, UPPER('o=MyCompany,c=NL'),
3, 0, 'baseObject' FROM unixusers WHERE userid='root'
UNION
SELECT (1000000000+userid),
UPPER(CONCAT(CONCAT('cn=',gecos),',o=MyCompany,c=NL')),
1, 0, userid FROM unixusers
UNION
SELECT (2000000000+groupnummer),
UPPER(CONCAT(CONCAT('cn=',groupnaam),',o=MyCompany,c=NL')),
2, 0, groupnummer FROM groups;
If your RDBMS does not support unions in views, only one objectClass can be mapped in ldap_entries, and the baseObject cannot be created; in this case, see the baseObject directive for a possible workaround.
Having the meta-information loaded, the SQL backend uses these tables to determine a set of primary keys of candidate entries (depending on the search scope and filter). It tries to do this for each objectclass registered in ldap_oc_mappings.
Example: for our query with the filter (telephoneNumber=123*), the following query would be generated to load the candidate IDs:
SELECT ldap_entries.id,persons.id, 'person' AS objectClass,
ldap_entries.dn AS dn
FROM ldap_entries,persons,phones
WHERE persons.id=ldap_entries.keyval
AND ldap_entries.oc_map_id=?
AND ldap_entries.parent=?
AND phones.pers_id=persons.id
AND (phones.phone LIKE '%1%2%3%')
(The "ldap_entries.parent=?" condition above applies to ONELEVEL searches; for a BASE search it is replaced by "... AND dn=?", and for a SUBTREE search by "... AND dn LIKE '%?'".)
Then, for each candidate, we load the requested attributes using per-attribute queries like
SELECT phones.phone AS telephoneNumber
FROM persons,phones
WHERE persons.id=? AND phones.pers_id=persons.id
Then we use test_filter() from the frontend API to check each entry against the full LDAP search filter (since we cannot reliably take the SYNTAX of the corresponding LDAP schema attribute into account, the filter is translated into the most permissive SQL condition when selecting candidates), and send the matching entries to the client.
ADD, DELETE, MODIFY and MODRDN operations are also performed using per-attribute meta-information (add_proc etc.). In those fields one can specify an SQL statement or stored procedure call that adds or deletes given values of a given attribute, using the given entry keyval (see the examples; mostly PostgreSQL, Oracle and MS SQL Server, since as of this writing there are no stored procedures in MySQL).
We just add more columns to ldap_oc_mappings and ldap_attr_mappings holding the statements to execute (create_proc, add_proc, del_proc etc.), and flags governing the order of the parameters passed to those statements. Please see the samples to find out which parameters are passed and in what order; they are self-explanatory for anyone familiar with the concepts described above.
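As a purely hypothetical sketch (the samples in rdbms_depend/ are the reference, and the actual parameter order is governed by the flags just mentioned), an add_proc and a delete_proc for the telephoneNumber mapping above could be as simple as parametrized statements receiving the entry's keyval and the attribute value:

-- assumes phones.id is generated automatically by the RDBMS
INSERT INTO phones (pers_id,phone) VALUES (?,?)

DELETE FROM phones WHERE pers_id=? AND phone=?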
First of all, let's recall that, among other major differences from the complete LDAP data model, the concept illustrated above does not directly support features such as multiple objectclasses per entry, or referrals. Fortunately, they are easy to accommodate in this scheme. The SQL backend requires that one more table be added to the schema: ldap_entry_objclasses(entry_id, oc_name).
That table contains any number of objectclass names that the corresponding entries possess, in addition to the one given in the mapping. The SQL backend automatically adds to each objectclass mapping an attribute mapping for the "objectClass" attribute that loads values from this table. So you may, for instance, have a mapping for inetOrgPerson, and use it for queries for the "person" objectclass...
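For example, to make the person entry from the ldap_entries sketch above also appear as an inetOrgPerson, one extra row is enough (the entry_id refers to ldap_entries.id and is example data):

INSERT INTO ldap_entry_objclasses (entry_id,oc_name)
VALUES (2,'inetOrgPerson');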
Referrals used to be implemented in a loose manner by adding an extra table that allowed any entry to host a "ref" attribute, along with a "referral" extra objectClass in the ldap_entry_objclasses table. In the current implementation, referrals are treated like any other user-defined schema, since "referral" is a structural objectclass. The suggested practice is to define a "referral" entry in ldap_oc_mappings, holding a naming attribute, e.g. "ou" or "cn", and a "ref" attribute containing the URL; if multiple referrals per entry are needed, a separate table for URLs can be created, with the URLs mapped to the respective entries. The use of the naming attribute usually requires adding an "extensibleObject" value to ldap_entry_objclasses.
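A hypothetical sketch of such a referral mapping, following the same pattern as the "person" mapping above (the referrals table, its columns and the numeric ids are assumptions, not names required by the backend):

-- a "referral" objectclass mapping, keyed on a hypothetical referrals table
INSERT INTO ldap_oc_mappings (id,name,keytbl,keycol)
VALUES (3,'referral','referrals','id');

-- naming attribute and ref attribute loaded from that table
INSERT INTO ldap_attr_mappings
    (id,oc_map_id,name,sel_expr,from_tbls,join_where)
VALUES (10,3,'ou','referrals.name','referrals',NULL);

INSERT INTO ldap_attr_mappings
    (id,oc_map_id,name,sel_expr,from_tbls,join_where)
VALUES (11,3,'ref','referrals.url','referrals',NULL);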
As previously stated, this backend should not be considered a replacement of other data storage backends, but rather a gateway to existing RDBMS storages that need to be published in LDAP form.
The hasSubordinates operational attribute is honored by back-sql in search results and in compare operations; it is also partially honored in filtering. Owing to design limitations, a (brain-dead?) filter of the form (!(hasSubordinates=TRUE)) will give no results instead of returning all the leaf entries, because it actually expands into ... AND NOT (1=1). If you need to find all the leaf entries, please use (hasSubordinates=FALSE) instead.
A directoryString value of the form "__First___Last_" (where underscores represent spaces, ASCII 0x20) corresponds to its prettified counterpart "First_Last". This is not currently honored by back-sql when non-prettified data is written directly via the RDBMS; when non-prettified data is written through back-sql, the prettified values are used instead.
When the ldap_entry_objclasses table is empty, filters on the objectClass attribute erroneously result in no candidates. A workaround is to add at least one row to that table, no matter whether it is valid or not.
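For instance, assuming an existing entry id, even a single row like the following is sufficient:

INSERT INTO ldap_entry_objclasses (entry_id,oc_name)
VALUES (1,'person');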
The proxy cache overlay allows caching of LDAP search requests (queries) in a local database. See slapo-pcache(5) for details.
There are example SQL modules in the slapd/back-sql/rdbms_depend/ directory in the OpenLDAP source tree.
The sql backend honors access control semantics as indicated in slapd.access(5) (including the disclose access privilege when enabled at compile time).
2021/01/18 | OpenLDAP 2.4.57+dfsg-3+deb11u1 |