Excerpts from Oliver Loch's message of 2012-05-24 09:01:50 -0700:
> So given that the multi master synchronization is working, and the
> time sync works too, will I run into database problems with the KDC
> services? Is all the information stored in the DIT and can one of the
> KDCs get into trouble because the data in the tree doesn't match the
> one in its cache (as far as there is one)? That's the main thing I'm
> concerned about.
AFAIK, there's no cache, so you shouldn't have to worry about this.
> All I want is that I do not have to deal with anything when failover
> is happening. Like a broken Database on one of the KDCs or conflicts
> because they both write to the DIT (write locks), or one of the KDCs
> crashes because the data returned from the DIT is not the one it
> expected or something like that.
We use a multimaster OpenLDAP setup as a Kerberos backend at my site for
this reason. It works well for us.
Our setup is a little bit different than the one described by another
poster in this thread. For various reasons, we decided to make each KDC
talk only to a slapd running on the same host as itself, as in:

                                |  (this is implemented with a
                                |   _kerberos._udp SRV record
                                |   that names all three servers)
          |                     |                     |
 [ kdc on server 1 ]   [ kdc on server 2 ]   [ kdc on server 3 ]
          |                     |                     |
          |                     |                     |  (each KDC
          |                     |                     |   connects to a
          |                     |                     |   slapd on localhost)
          |                     |                     |
[ slapd on server 1 ]-[ slapd on server 2 ]-[ slapd on server 3 ]
            (implemented with MMR and delta-syncrepl)

(with apologies for the tortured ASCII diagram)

On the OpenLDAP side, we use delta-syncrepl instead of standard syncrepl
to implement MMR. As mentioned earlier, standard syncrepl works at the
object level: it replicates whole entries. We weren't comfortable with
that granularity, since we could think of not-entirely-implausible
situations in which it might not do the right thing from a KDC database
perspective. For example, a password change and a failed login count
increment occurring on the same principal on different KDCs at very
nearly the same time could yield a database in which the failed login
count is incremented but the password is unchanged. delta-syncrepl
maintains a separate changelog-style database that records modifications
to individual attributes within an LDAP entry, and does a better job
than standard (object-level) syncrepl of handling those edge cases.
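
To make the edge case above concrete, here is a toy Python model of the
two granularities. It is only a sketch and does not reflect how slapd
actually stores or replicates entries; the two functions and the
last-writer-wins behavior are simplifying assumptions, and the attribute
names are used purely for illustration.

```python
import copy

def replicate_whole_entry(local_entry, remote_entry):
    """Object-level model: the incoming entry replaces the local one wholesale."""
    return copy.deepcopy(remote_entry)

def replicate_delta(local_entry, changed_attrs):
    """Attribute-level model: apply only the attributes the remote write changed."""
    merged = copy.deepcopy(local_entry)
    merged.update(changed_attrs)
    return merged

# Starting state of one principal, identical on both servers.
base = {"krbPrincipalKey": "old-key", "krbLoginFailedCount": 0}

# Server 1 handles a password change. Server 2 records a failed login
# moments later, before it has seen the password change.
server1 = dict(base, krbPrincipalKey="new-key")
server2 = dict(base, krbLoginFailedCount=1)

# Object-level: server 2's later write carries its stale copy of the key.
object_result = replicate_whole_entry(server1, server2)

# Attribute-level: only the counter change is shipped to server 1.
delta_result = replicate_delta(server1, {"krbLoginFailedCount": 1})

print(object_result)
print(delta_result)
```

In this toy model the object-level result keeps the old key with the
incremented counter, while the attribute-level result keeps both the new
key and the incremented counter.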
Each of the KDC servers runs both kadmin and kdc daemons. The MMR
configuration seems to handle uncoordinated writes from multiple
distinct kadmins to the underlying LDAP database without any trouble.
That said, for various reasons we wanted a single DNS name for kadmin,
all client requests at any one time going to a single kadmin, and a
predictable failover order in the event of a failure. We implemented
that at the network level with a load balancer that presents a VIP to
clients and routes their traffic to a working kadmin (seamlessly failing
over between them as necessary to account for service interruptions).
The DNS name for the kadmin service is, of course, associated with the
VIP.
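
The selection rule the load balancer applies can be sketched roughly as
follows. This is not our load balancer's configuration, just an
illustration of "first healthy server in a fixed order"; the host names
are hypothetical, and the TCP check on port 749 (the usual kadmind port)
is an assumed stand-in for whatever health check a real balancer uses.

```python
import socket

def tcp_port_open(host, port=749, timeout=2.0):
    """Assumed health check: can we open a TCP connection to the kadmin port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_kadmin(backends, is_healthy=tcp_port_open):
    """Return the first healthy backend in the fixed preference order, or None."""
    for host in backends:
        if is_healthy(host):
            return host
    return None

# Hypothetical host names, listed in failover order.
KADMIN_ORDER = [
    "server1.example.com",
    "server2.example.com",
    "server3.example.com",
]

# Simulate server 1 being down without touching the network.
down = {"server1.example.com"}
chosen = pick_kadmin(KADMIN_ORDER, is_healthy=lambda h: h not in down)
print(chosen)
```

Because the order is fixed, all clients land on the same kadmin at any
one time, and it is always clear which server takes over next.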
The end result provides a fault-tolerant authentication service (since
a client will be able to tolerate a broken KDC by talking to one of the
other KDCs), and also provides fault-tolerant kpasswd and kadmin
services (if one of the KDCs breaks, the network load balancer will
detect the failure and re-route traffic as/if necessary), which satisfies
our use case. You can get the former without using the LDAP backend
through kpropd/kiprop/etc., but the LDAP backend is the best way that
I'm aware of to get the latter.