The specification of access control lists for the InternetWide Architecture is complex. Naive implementations may therefore be inefficient. But, lucky us, we are not naive.

The ACL definitions allow many combinations to be tried, combining influences from several factors, which makes evaluation sound like a complex task. And if we follow our plans, where various service providers all need to know the same ACL information, possibly including part of our contact list, it starts to sound like bringing a lot of data out into the open, and letting it end up on front-end servers that sit close to the security perimeter of an organisation.

As a dramatically complicating factor, we want to auto-detect an alias for each attempt to contact a user's alias-free address (and we even want to go so far as to remove any provided alias, auto-detect the alias and verify that it is the same as the one provided) by finding the most suitable Communication ACL, as if these ACLs were not complex enough already. This may sound like a lot of work. And to make matters worse, there may in fact be multiple recipients for any communication. Can we control all that complexity?

It turns out that ACLs, even the Communication ACLs, can be implemented efficiently. The solution described below not only computes an ACL with just a few lookups in a simple key-value database; it even supports lookups from a database with encrypted content. As a result, even when a cracker gains access to the address processing front-end server and even when he downloads the database, he still cannot directly read the communication addresses of those involved!

Note: A companion document describes how we control group/role access.

Implementing Communication ACLs

Communication ACLs define the black and white lists for various remote addresses. Such ACLs are used to select remote peers that may, or may not, get in contact with the holder of the ACL. A separate ACL is created for each local address, even separately for the many aliases that a user may have.

We shall make one large key-value database that integrates ACLs for not just all the aliases of a user, but also for all users of a domain and all domains that use a service. This allows a multi-tenant service (such as a mail server for many domains) to query one database for its overall access policy.

The database maps <remote,local> communication pairs to the various aliases that they support, embellished with black/white/gray scores for each. Note that gray-listing calls for further evaluation of a remote with the intention to add them either on the white or black list. The manner in which graylisting is implemented remains protocol-specific, even though the resulting black or white listing is shared between protocols.

Values in Communication ACL Databases

The values found in the ACL database describe access rights as strings of space-separated words. The permitted word forms are:

  • @B@ to indicate that the following entries are black-listed
  • @W@ to indicate that the following entries are white-listed
  • @G@ to indicate that the following entries are gray-listed
  • for entries preceding the above, white-listing is applied
  • + indicates unaltered acceptance of a service or user without alias
  • +alias to indicate an alias for the queried user
  • user+alias to indicate a complete local part for the queried user. TODO: Mainly used for group/role, but we also need the alias.
  • user@domain.name to indicate a remote identity to substitute for the local identity. TODO: Mainly used for forwarding, which can be complex and is usually solved in a protocol-specific manner.

An example entry might describe access rights as

+cook +dancer @G@ +info @B@ +private @W@ ballet+redshoes

This describes whitelist entries for the aliases cook and dancer as well as for access to group or role ballet under member/occupant name redshoes. It also allows access to the info alias, but would require some form of gray-listing that is specific to the protocol at hand. Finally, access to the alias private will be rejected. (Note that this is a syntactical example; it may not be meaningful to have this combination in a real environment.)

When a local username is supplied without an alias, any or all of the given aliases may be added. It is also possible to use the group membership ballet+redshoes for the local part, if that feels more useful (perhaps because the remote address is from the same group). If an alias was supplied for the contacted local address, then one may be selected from the list, with a preference for white-listed entries, falling back through gray-listed entries to black-listed ones. If an alias was supplied but it matches none of the listed options, then a selection may be made from the permitted ones, and some message could be sent back to the remote address to indicate a change of address.

Note that it is permitted to contact a local address without an alias, but that communication always involves a local alias. Such an alias can be silently substituted by the rules above. Only when a local alias is addressed and changed by these rules will feedback about the change of address be generated for the user. It may well take the form of a "this user has moved to another address, please update your contact information" style of message, and it is quite possible to use tools like Bloom filters to avoid sending the same message over and over.
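To make the interpretation of such a value string concrete, the sketch below parses the space-separated words and selects an alias with the preferences just described. The function names and the exact fallback policy are illustrative assumptions, not part of the specification.

# A sketch of interpreting an ACL value string; function names and the exact
# fallback policy are illustrative assumptions, not a normative parser.

def parse_acl_value(value):
    """Return (listing, entry) pairs, where listing is 'W', 'G' or 'B'."""
    listing = 'W'                      # entries before any marker are white-listed
    parsed = []
    for word in value.split():
        if word in ('@W@', '@G@', '@B@'):
            listing = word[1]          # switch listing for the following entries
        else:
            parsed.append((listing, word))
    return parsed

def select_alias(value, requested=None):
    """Prefer a requested alias when listed; otherwise pick white over gray
    over black, as described above."""
    entries = parse_acl_value(value)
    if requested is not None:
        for listing, entry in entries:
            if entry == '+' + requested:
                return listing, entry
    for preference in ('W', 'G', 'B'):
        for listing, entry in entries:
            if listing == preference:
                return listing, entry
    return None

# select_alias("+cook +dancer @G@ +info @B@ +private @W@ ballet+redshoes")
# returns ('W', '+cook'); requesting "private" returns ('B', '+private').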

Whenever changes are made to an ACL, the corresponding access rights used by the communication software will be updated accordingly. Newly introduced aliases would usually be added to the end, so as to avoid abrupt changes to existing communication patterns. Entries that are both white-listed and black-listed are turned into gray-listed entries.

Keys in Communication ACL Databases

Now that we know what we are looking for, we can turn to the keys that lead to the information. Clearly, we don't need to search for aliases, as these are found in the database and can be compared to the local address with which a remote attempts to communicate. We can simply look for the user@domain.name form, even if communication is attempted with user+alias@domain.name. This saves us a dimension of queries and it gives us more flexibility in automatically adding an alias to communications that enter without one.

Internally, we always communicate with a local alias, and most users would want an alias named default as a fallback. ACLs are attached to aliases; default would likely capture all remotes, while more specific remote patterns override the default behaviour.

The keys that we are looking for are conceptually represented as a tuple of addresses,

<remote_pattern,local_address>

Interpretation of the remote address can be based on standardised or broadly used structures in email addresses. Domain names, for instance, can generally be generalised to their parent domains (which we match with a pattern like @.com), and we can either mention the username part (as in user@example.com) or leave it out (as in @example.com).

Even our favourite local identity trickery, where we use the + to separate user and alias, is widely used: most mail systems will accept john+cowboy@example.com for an account john@example.com. What sets our approach apart is not so much that we also use these extended addresses, but what value we assign to them, how we allow their management, their separate ACLs, and so on. What this boils down to is that a remote address with a + in its local part can quite reasonably also be tried without the alias. So, john+cowboy would be matched by john+cowboy, then by john+ and finally by the empty string. Note that we do not match against john without the plus sign; we consider that going too far in the interpretation of the address. The match against john+ means that any alias will do, but one must be present. TODO: This is a bit arbitrary, we may change this later.

All this is just saying that we read the local and remote address as a DoNAI and use a DoNAI Selector to perform upward iteration on the remote identity.

The local address in the conceptual pair is a complete address (though any alias is removed from the key) and the remote address is subject to iteration over possible DoNAI Selectors that could match. While doing this iteration from concrete to generic, the first database entry found is conclusive.

This use of a first find means that the most concrete entry in the ACL wins. This allows local overrides to be made on top of general patterns. It supports setting up a default alias with @. as the remote address pattern for each user, with further rules about its handling, for example graylisting, and the option of routing it in the communication software based on the local identity with the +default alias.
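To illustrate this iteration, the sketch below enumerates selector candidates for a remote address, from concrete to generic. The exact candidate set and its ordering are normatively defined by the DoNAI Selector specification; the interleaving of local-part and domain generalisation chosen here is merely an assumption for demonstration.

# A sketch of enumerating DoNAI Selector candidates for a remote address,
# from concrete to generic; the candidate set and ordering are assumptions
# here and are normatively defined by the DoNAI Selector specification.

def remote_selectors(remote):
    local, _, domain = remote.partition('@')
    # Local-part variants: the full form, "user+" when an alias is present,
    # and finally the empty string (we never match the bare user name).
    local_variants = [local]
    if '+' in local and not local.startswith('+'):
        local_variants.append(local.split('+', 1)[0] + '+')
    local_variants.append('')
    # Domain variants: the domain itself, then parent-domain patterns, then ".".
    labels = domain.split('.')
    domain_variants = [domain]
    domain_variants += ['.' + '.'.join(labels[i:]) for i in range(1, len(labels))]
    domain_variants.append('.')
    for loc in local_variants:
        for dom in domain_variants:
            yield loc + '@' + dom

# list(remote_selectors("john+cowboy@example.com")) starts with
# "john+cowboy@example.com" and ends with "@." under these assumptions.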

Private Communication ACL Database Keys

The keys of a communication ACL database are stored in a hashed form. The purpose of this is to stop attackers from reading the contact information of many users. The act of hashing is quite straightforward, and the benefits in terms of privacy can be considerable when a service-hosting machine is successfully accessed by a cracker interested in a spam address list.

The hashing scheme is based on a keyed hash, such as DIGEST-SHA256. (Note that the DIGEST-H algorithms are deprecated for client-server authentication in favour of the more interactive SCRAM series of algorithms, but the purposes described here are not dynamic key agreement so much as repeatable key derivation procedures.) This introduces two levels of information: first a key is hashed, and once that is done the rest follows.

We shall assume that a context for the computation can be cloned, so a partial hash result can be easily continued in many ways. This helps to make such computations more efficient, while also being easy to implement. Where this is impossible, the computation may alternatively be started from scratch on every iteration.

Key Initiation

The initiation of the hash procedure consists of loading a database protection secret. This secret can be in any form, such as a hexadecimal string with the entropy of the hash in use, as this form will be hashed. The hash outcome is then loaded as the key of the keyed hash algorithm; we can now close off the key phase, which usually calls for a hash finalisation and inserts the result into a second phase of the keyed hash. It is the context data for this second phase that we will use with the communication peers, and for which it will be beneficial to fork the keyed hashing context.
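A minimal sketch of this key phase, assuming HMAC-SHA512 as the keyed hash (the helper name prepare_keyed_context is ours, not part of any existing library):

# A minimal sketch of the key-initiation phase, assuming HMAC-SHA512 as the
# keyed hash; the helper name prepare_keyed_context is ours.

import hashlib
import hmac

def prepare_keyed_context(db_protection_secret):
    # First phase: hash the database protection secret; the digest becomes
    # the key of the keyed hash.
    key = hashlib.sha512(db_protection_secret).digest()
    # Second phase: a keyed-hash context that will absorb application data.
    ctx = hmac.new(key, digestmod=hashlib.sha512)
    # The caller should now wipe db_protection_secret and key; only ctx (and
    # clones made with ctx.copy()) is needed from here on.
    return ctx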

We can now wipe the database protection key and its hash through zero-byte overwrites, and proceed with a somewhat less-protected interaction. In terms of services, this could mean dropping process privileges to a level where it cannot access the database protection key anymore. Or, if live updates of key material are desired, a separate process with access to the database protection key might communicate the prepared hashing context to a service that can stay online while keys are being swapped.

All this can be done without the database protection key ever being available in a program that is, at that moment, online. We believe this design allows for the best protection of privacy-sensitive information. In our project's future phase ServiceHub, we will probably share the database protection secret between a service and identity provider. A service's database may end up holding information for many identity providers, all mixed without a need of further distinction!

Key Priming for the Communication ACL

We now insert a text that is specific to this use of the scheme. For other uses of the scheme we shall use another text, so that there is no risk whatsoever of attacks that overlay data meant for one interpretation with, say, a cleverly constructed remote address.

The text starts by stating the application for which it is used:

00000000  43 4f 4d 4d 55 4e 49 43  41 54 49 4f 4e 20 41 43  |COMMUNICATION AC|
00000010  4c 20                                             |L |
00000012

It is then padded with at least one x character so that the second-phase input, including the first-phase key insertion, fills a whole number of hash blocks. For SHA512 the block size is 128 bytes, so we would end with

00000000  43 4f 4d 4d 55 4e 49 43  41 54 49 4f 4e 20 41 43  |COMMUNICATION AC|
00000010  4c 20 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |L xxxxxxxxxxxxxx|
00000020  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000030  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000040  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000050  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000060  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000070  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
00000080

Once more, it may make sense to save the hashing context, so as to clone the results of this initial computation and thereby save time later on. The expense is that a separate key must be cached (and retrieved) for each application of our database scheme.

The reason for the padding up to the block size (not the digest size) is to have no lingering work in the hash context, as that would be repeated. Just a bit of loop optimisation. There is also a mild security benefit from the scrambling process, but no security assumptions can really be made about that without studying the hash algorithm in detail.
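The priming step could then look like the sketch below, again assuming SHA-512 with its 128-byte block size; the helper name is illustrative.

# A sketch of priming the context for Communication ACL use; SHA-512's block
# size of 128 bytes and the at-least-one-x padding rule follow the text above.

BLOCK_SIZE = 128  # SHA-512 block size in bytes

def prime_for_communication_acl(ctx):
    label = b"COMMUNICATION ACL "
    padding = BLOCK_SIZE - (len(label) % BLOCK_SIZE)
    if padding == 0:
        padding = BLOCK_SIZE            # always pad with at least one 'x'
    ctx.update(label + b"x" * padding)
    return ctx                          # caching ctx.copy() here saves work later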

Adding Local User

Given that the second phase of the keyed hash has been started by the insertion and use of a key, we can now start adding the local contact, which will not change as we iterate over the DoNAI Selectors that could match the remote identity, ordered from concrete to generic.

In fact, we add the local contact (which is devoid of whitespace), normalised as described below, followed by a space character; the UTF-8 encoding of the space U+0020 means that we add the byte 0x20 to the hash. The normalisation strips away aliases, so we can look those up in the database (and learn about black-listing and white-listing).

We now have a keyed hash that we can clone once again for the various local values that we intend to hold against the ACL.
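In code, this step might look like the following sketch, where normalise_local stands for the address normalisation described further below.

# Sketch of adding the local contact: absorb the normalised local address plus
# a single space (byte 0x20) into a clone of the primed context; the helper
# normalise_local stands for the normalisation procedure described below.

def context_for_local(primed_ctx, local_address):
    ctx = primed_ctx.copy()
    local = normalise_local(local_address)      # aliases stripped, lowercased
    ctx.update(local.encode("utf-8") + b" ")
    return ctx                                  # clone again per remote selector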

Adding Remote Selector

We now enter DoNAI Selector iteration, with the remote identity as a starting point, after it has been normalised as described below. Each selector is appended to the keyed hash in turn, followed by a key trailer, and finally the hash computation is finished.

The key trailer is the following ASCII string (shown as a hexdump with bytes and ASCII data):

00000000  20 44 41 54 41 42 41 53  45 20 4b 45 59 20 45 4e  | DATABASE KEY EN|
00000010  43 52 59 50 54 49 4f 4e                           |CRYPTION|
00000018

The resulting hash value is the key to our communication ACL database. If so desired, a prefix of the hash value may alternatively be used, but care must be taken to avoid clashes. A wide consensus exists that 128 bits reduces the risk of a clash to unrealistically low values. A reason for preferring a prefix instead of the full hash value might be space efficiency. TODO: One final reason to use a prefix could be the use of a trailing part of the hash outcome as a key, which is a little bit simpler than what is proposed below, but also based on the assumptions that hash outcomes are large enough to avoid both the clash and a security key; as it happens however, this is how hashes are scaled, in an attempt to avoid birthday attacks. TODO: Are birthday attacks a risk in our use of hashes, or are they in fact useful to reduce the oracle function for a stolen database?

Iteration starts at the most concrete value, and ends at the most generic. The most concrete entry should win any conflict between multiple matching entries, so the search stops as soon as a key is found. At this point, we have a value to process.

When no entry matches at all, the communication attempt is not permitted.
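Putting the selector iteration, the key trailer and the lookup together, a sketch of the lookup loop could read as follows; the plain dict stands in for whatever key-value database is actually used.

# Sketch of the lookup loop: clone the per-local context for each selector,
# append the selector and the key trailer, and stop at the first hit.  The
# plain dict stands in for whatever key-value database is actually used.

KEY_TRAILER = b" DATABASE KEY ENCRYPTION"

def lookup_communication_acl(db, local_ctx, selectors):
    for selector in selectors:                  # from concrete to generic
        ctx = local_ctx.copy()
        ctx.update(selector.encode("utf-8"))
        value_ctx = ctx.copy()                  # kept for value-key derivation
        ctx.update(KEY_TRAILER)
        db_key = ctx.digest()                   # or a prefix, e.g. the first 16 bytes
        if db_key in db:
            return db_key, value_ctx, db[db_key]
    return None                                 # no entry: not permitted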

Private Communication ACL Database Values

We shall add similar protection to the values found in the database. This can be achieved through encryption.

To derive such a key, we should fork the keyed hashing context once more, namely just before adding the key trailer. Starting from such a cloned value, we will add a value trailer, which is the following ASCII string (shown as a hexdump with bytes and ASCII data):

00000000  20 44 41 54 41 42 41 53  45 20 56 41 4c 55 45 20  | DATABASE VALUE |
00000010  45 4e 43 52 59 50 54 49  4f 4e                    |ENCRYPTION|
0000001a

After adding this, we finish the cloned hash computation once more. This yields the encryption key applied to the value. If the hash output is longer than the key needed, which it usually is when algorithms of comparable strength are used, then a prefix of the hash may be used. Any symmetric form of encryption can be used, for instance a suitable AES scheme whose key is taken from the hash output, so AES-256 when DIGEST-SHA512 is the keyed hash algorithm.

The keyed hash and encryption algorithms used to protect the Communication ACL Database are subject to change at a very slow pace, usually as the result of cryptographic developments that might hamper the privacy of the database. During a transition, key/value pairs may be inserted under the new algorithms, and after completion of the transition the old key/value pairs may be removed. There does not seem to be a need for separate key/value stores if both key and value may vary in size, but using them remains a possible implementation of such algorithm rollovers.

The encryption algorithm is used in a mode that includes a form of block chaining, as well as an integrity code (MAC). As a first choice, we might think of AES-GCM. When AEAD forms are used, they incorporate the second-phase hash data as associated data (meaning that the database key will be included in the integrity code, not just the encrypted value data). Thanks to the MAC, it would even be possible to handle clashes, when a database can serve multiple values for the same key. This may offer yet another place to improve privacy in a future release, by shortening the number of bits stored for database keys.

The value as stored in the database may be preceded with a 32-bit code to identify the source. The identities would be locally assigned, and they can be used to remove all entries from the database that originated with a particular peer. This field is there to help with sudden breakages, but would not normally be of any use. When we move our projects into our third phase of ServiceHub we will need to deal with services hosting sites with input from a multitude of upstream providers, and we will then be glad to have added this field.
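A sketch of value decryption along these lines, assuming AES-256-GCM (here via the Python cryptography package) and an assumed storage layout of a 4-byte source code, a 12-byte nonce and the ciphertext with its tag:

# A sketch of value decryption, assuming AES-256-GCM (via the Python
# "cryptography" package) and an assumed stored layout of a 4-byte source
# code, a 12-byte nonce and the ciphertext with its authentication tag.

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

VALUE_TRAILER = b" DATABASE VALUE ENCRYPTION"

def decrypt_value(value_ctx, db_key, stored):
    ctx = value_ctx.copy()                      # forked before the key trailer
    ctx.update(VALUE_TRAILER)
    value_key = ctx.digest()[:32]               # AES-256 key from a SHA-512 prefix
    source_code = stored[:4]                    # unencrypted origin marker
    nonce, ciphertext = stored[4:16], stored[16:]
    # The database key is passed as associated data, so the MAC covers it too.
    return AESGCM(value_key).decrypt(nonce, ciphertext, db_key)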

Resource Access ACL

The user defines a Resource ACL as a mapping of domain names or patterns to access rights like Read and Create. For each of the access rights, he specifies a list of DoNAI Selectors, which permits the use of wild cards. Unlike for Communication ACLs, the Resource ACL does not distinguish just a black list and a white list, but instead a more subtle assignment of an access right that can range from ultimately powerful to humbly disempowered.

Resources are a bit more specific to the service at hand. In fact, we distinguish resources (which are a generic, static concept) and resource instances (which are individual occurrences of the generic concept, created whenever desired). Think of a software package as being tied to a resource, and domains or perhaps user accounts as its accessible data sections. The actual choices are made for individual resources. In general, something should define a resource if it is not a user and if it requires its own ACL. Instances are a second level that keeps one software bundle from claiming large numbers of identities to distinguish what it considers its instances. This is vague, because it is an abstract concept for the design of to-be-built (external) service implementations.

Where the Communication ACL ranges over all protocols in a similar manner, this is not the case with the Resource ACL. Resources for web and mail are different, even for the same user. Resource instances of two users are normally different, even when running on the same server.

Resources are 128-bit identities, which may be textually represented as a UUID. Resource instances are 128-bit identities, which may also be represented in textual form as a UUID, but it should be considered that a resource instance is always mentioned as a further refinement on top of a resource, so it is actually a 256-bit identity with a representation like two UUIDs with, say, a dash between them.

Values in Resource ACL Databases

Each value stored in a Resource ACL database is simply a string of the access rights assigned to the indexed resource, listed in any order, one character per right. See the Filter-Id description in the RADIUS/Diameter API description. We list them in uppercase, surrounded by @ symbols. Even when a right implies other rights, we do specify them separately.

An example value could be

@WRPKOV@

to grant permissions for writing, reading, proving, knowing, owning and visiting a resource. Alternatively,

@V@

is the ultimate slayer: no rights other than visiting are granted, and nothing, not even a listing of the not-to-be-viewed elements in the resource, is provided. In some implementations, this may be presented as an authorisation error. When no database entry exists, such an error is indeed the proper response.

The format is deliberately verbose (as is the case for the Communication ACL database values) to allow for easy future extensions.
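Checking a right against such a value is then trivial, as in this small sketch (the helper name is illustrative):

# A tiny sketch of checking one right against a Resource ACL value such as
# "@WRPKOV@"; the helper name is illustrative.

def has_right(acl_value, right):
    return right.upper() in acl_value.strip("@")

# has_right("@WRPKOV@", "R") is True; has_right("@V@", "W") is False.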

To protect the values, we follow the same mechanism as described for the Communication ACL database values. The desired protection is not so much privacy (though that does no harm) as authenticity of the values. It could suffice to only sign the values, but privacy is not a bad thing to have, and the overhead of separate coding and operational procedures is probably more trouble than it is worth.

All the other encryption techniques match those used for the Communication ACL, and will indeed be shared.

Keys in Resource ACL Databases

The lookup mechanism for a Resource ACL is similar to that of a Communication ACL. Once again, we use a keyed hash, and once again we start by hashing a database protection secret. The binary 128-bit form of the Resource is now attached to the key, because that form is usually fixed in software and because keeping the keyed hash more open than strictly needed would allow keying information to migrate between code bases, which we want to avoid. Again, we may drop privileges at this point (or receive prepared hashing contexts from a more privileged process).

Then we insert a text that depends on whether only a resource will be inserted, or a resource followed by a resource instance. For a resource only, we identify the key use as

00000000  52 45 53 4f 55 52 43 45  20 41 43 4c 20           |RESOURCE ACL |
0000000d

and for a resource plus instance we identify the key use as

00000000  52 45 53 4f 55 52 43 45  20 49 4e 53 54 41 4e 43  |RESOURCE INSTANC|
00000010  45 20 41 43 4c 20                                 |E ACL |
00000016

We add padding in the same manner: at least one and possibly more x characters, until the first-phase hash plus the string ends on a number of bytes that is a multiple of the hash block size. This empties out pending work from the context, which benefits efficiency when we clone such hash contexts.

Now, to separate domains properly, we introduce the domain name, which MUST NOT have an internal space, in lowercase and without trailing dot, and we follow this canonical domain name with a space; for instance, Orvelte.NEP. would map to

0000???0  6f 72 76 65 6c 74 65 2e  6e 65 70 20              |orvelte.nep |
0000???c

We already incorporated the resource with the key, but when a resource instance is also used, we need to incorporate it still, which we do now. This is a byte string, with textual and binary forms both acceptable. We do need to distinguish the string from what follows it, so we insert a 16-bit length of the string (and bail out in error for byte strings longer than 16383 bytes) and write it in network byte order (high before low). Following its length, the byte string itself is inserted. Language-specific trailers such as the NUL character of C-style strings are not included; 00 bytes may legitimately occur in binary strings, but not as an artefact of language conventions in textual strings.

We do not pad again at this point, because the hash buffer does not require the security of having no stored data left in the buffer. We have, however, arrived at another point where cloning the hash context is useful, although by now the number of iterations is much lower, possibly even just one.

We now add the authorisation identity in canonical form, assuming that it is accessible to the current authenticated user, to the hash. We then add the same key trailer as for Communication ACL lookup keys, finish the hash computation, and use the result as a lookup key in the database. We enumerate the DoNAI Selectors that would match the authorisation identity, going from concrete to abstract as defined for DoNAI Selectors, and let the most concrete match win. When no entry is found, we return an error, most likely to be interpreted as an authorisation error.
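The sketch below strings these Resource ACL key-derivation steps together. Since the text does not pin down exactly how the 128-bit resource identity is attached to the key, appending it to the protection secret before hashing is an assumption here; the block size and key trailer reuse the Communication ACL sketches above.

# A sketch stringing the Resource ACL key-derivation steps together.  How the
# 128-bit resource identity is "attached to the key" is not pinned down above,
# so appending it to the protection secret before hashing is an assumption.

import hashlib
import hmac
import struct

BLOCK_SIZE = 128                                # SHA-512 block size, as before
KEY_TRAILER = b" DATABASE KEY ENCRYPTION"

def resource_acl_keys(db_secret, resource_id, domain, instance, identity_selectors):
    key = hashlib.sha512(db_secret + resource_id).digest()   # assumed attachment
    ctx = hmac.new(key, digestmod=hashlib.sha512)
    label = b"RESOURCE ACL " if instance is None else b"RESOURCE INSTANCE ACL "
    padding = BLOCK_SIZE - (len(label) % BLOCK_SIZE)
    if padding == 0:
        padding = BLOCK_SIZE                    # always at least one 'x'
    ctx.update(label + b"x" * padding)
    ctx.update(domain.lower().rstrip(".").encode("utf-8") + b" ")
    if instance is not None:
        if len(instance) > 16383:
            raise ValueError("resource instance too long")
        ctx.update(struct.pack(">H", len(instance)) + instance)
    for selector in identity_selectors:         # from concrete to abstract
        h = ctx.copy()
        h.update(selector.encode("utf-8") + KEY_TRAILER)
        yield h.digest()                        # try each as a database lookup key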

Note that the use of different prefixes for the two forms (resource with and without instance) means that clashes are assumed to be non-existent in practice. As a result, the same key-value database can hold both of these tables integrated into one table. A similar reasoning applies to the Communication ACL, which may also be inserted in this same table.

Privacy for Resource ACL Database Values

The same mechanism as used for privacy in Communication ACLs is also used for privacy of the values of Resource ACLs. Again, the data is terminated not with the key trailer but with the same value trailer as used for the Communication ACL.

The distinction between the two kinds of ACL, even when they are merged into the same database, is made through the initial portion of what is being hashed, and so the trailer can be the same.

The information in the value for a Resource ACL is not very interesting, except perhaps the fact that an entry exists at all, but the encryption does protect against rogue overwrites by parties not in possession of the key. Furthermore, in case of clashes, the MAC woven into the encryption scheme may in some future version be helpful to distinguish multiple values for the same key hash.

Finally, the entries are preceded with the same unencrypted 32-bit field as is used for the Communication ACL entries. Most often, the same values will be shared, but the precise meaning is locally defined. A pattern of use that we expect is that this field identifies an identity provider along with a version of their data, to allow inserting bulk updates to entries and the bulk removal of outdated entries from a prior version. The use of such bulk operations is not for everyday addition and removal of individual users or even whole domains, but rather for changing algorithms or other major maintenance overhauls.

Address Normalisation

The aforementioned procedures handle identities that could have come from a variety of sources, in a variety of encodings and with a variety of frivolous pieces added. So, before using addresses they should be normalised.

  1. The process starts with an authorisation identity which is not necessarily the login identifier. Most protocols and authentication mechanisms allow the manual specification of an identity that should be assumed after the process, the so-called authorisation identity. Access Control Lists, whose implementation is described herein, are matched against those voluntarily "reduced" identities. (In other cases, there may be no such distinction; not all protocols work with authenticated identities.)

  2. The identity will use the Unicode character set, encoded as UTF-8. If needed, a transport encoding is removed and the character set may be altered. Part of removing the transport encoding is also validating that input formats such as UTF-8 and UTF-16 are properly represented, without syntax errors. The eventual UTF-8 encoding must be the most compact possible, so nonsense like 4-byte zero character codes is not allowed (the input may be corrected or rejected when it contains longer, but still syntactically correct, UTF-8 encodings).

  3. This step only applies to local identities. Any dynamic portions are removed. At the present time, we recognise such portions by their trailing + sign in the local part, usually something like ...+stat+DYN+. The dynamic DYN part is not processed during the evaluation of ACLs, but the stat part may help to identify that it is being used. We therefore reduce this form to ...+stat++ where the trailing ++ in the literal form helps to identify the dynamic part that was taken out.

  4. Any domain name label that includes punycode will now be mapped to the Unicode character set, with UTF-8 encoding. This is only done for domain name labels that have strictly ASCII forms.

  5. The entire name is mapped to lowercase, insofar as this has meaning for the code points (characters). We even do this for remote address local parts, even though protocols such as email declare their case-sensitivity to be local policy. In practice we don't see case-sensitive local parts; we do see trouble caused by being true to this form; and we believe it safe to assume that a domain is seriously troubled if it allows end users to register as Admin while a lot of power is assigned to an admin user.

  6. We now apply SASLprep to the address to further standardise it. (TODO: Or earlier? Or later?)

  7. This step only applies to local identities. Note that services are represented with what might be called a "null user name", by starting with a + sign, as in +contact+pgp and this form will not be reduced any further. Even though the addresses might hold a parameter to aid processing, that would then follow the notation for dynamic trailers, which should at this point have a trailing ++ due to the removal of dynamic (and impossible to match) address parts.

  8. This step only applies to local identities. For the usual identity forms defined for users, the ACL lookup processes call for removal of aliases from the local parts. Longer-than-usual local parts like john+sales+bulk are then reduced to just john, not john+sales or anything else. There are two exceptions to this, namely services (starting with a + symbol) and pruned dynamic forms (ending with the ++ symbols). When either (or both) of these exceptions applies, the local part will not be simplified when the procedures call for the removal of aliases.
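The final alias-stripping step (step 8), with its two exceptions, might be sketched as follows; the earlier normalisation steps are assumed to have been applied already.

# A sketch of the final alias-stripping step (step 8), honouring the two
# exceptions for service addresses (leading '+') and pruned dynamic forms
# (trailing '++'); earlier normalisation steps are assumed to have run.

def strip_alias(local_part):
    if local_part.startswith("+") or local_part.endswith("++"):
        return local_part                       # exceptions: keep unchanged
    return local_part.split("+", 1)[0]          # john+sales+bulk -> john

# strip_alias("john+sales+bulk") -> "john"
# strip_alias("+contact+pgp")    -> "+contact+pgp"   (service address)
# strip_alias("mary+stat++")     -> "mary+stat++"    (pruned dynamic form)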

Efficiency versus Conceptual Complexity

The design presented above is very efficient. We expect to do 1-4 database lookups, and finish one hash function for each. For the entry found, we finish one more hash function and perform a symmetric decryption operation with the key that this derives.

Given this, we can facilitate a rich and complex conceptual model:

  • We can store communication ACL patterns for all domains, even for different IdentityHub providers (each using their own key) all in the same database.
  • The database has a high degree of protection from attackers that find their way into the service at hand; tests may be made about its entries, but the entries cannot be listed or enumerated.
  • Given a local and remote address, we can determine access in a few lookups.
  • The lookups can distinguish between impossible, white-listed, black-listed and gray-listed entries.
  • The lookups can derive aliases, or interact with them.
  • The lookups can automatically select an alias to use.
  • The lookups can automatically change the proper alias to use.

Security Achievements

The database is protected with a keying scheme that is meant to protect it from prying eyes. Precisely who is capable of what?

To know what key to look up, as is the case during normal processing of requests to a Communication ACL or Resource ACL, the following information is needed:

  • The database protection key for the IdentityHub provider
  • For a Communication ACL, the communication data (from/to address)
  • For a Resource ACL, the UUID for the resource (and instance) and the targeted address

The address information does not add very much entropy. UUIDs are 128-bit long, but for resources they are typically fixed for a service. For resource instances, there often is a database to map from a use case to a value that may be looked up somewhere; when that lookup is not encrypted, the entropy added is far below 128 bit.

On top of that, address information can often be observed by passive observation of public (or unencrypted) traffic. The resources will often be found in open source code. Only the resource instances may be more diverse.

Because of this, we find that:

To guess what key to look up, one may iterate over address and resource information; this may be a rather large search space, but it does not pose a challenge of a cryptographic level, at least not for common user naming patterns. The one thing stopping iteration therefore is:

  • The database protection key for the IdentityHub provider

Note: We are assuming that the prepared keys, which are in fact as potent as the database protection key, are not easy to get to for an attacker, because they are not stored on any device but rather retained in memory. It would be possible in theory to use PKCS #11 for further protection, incorporating as a secret the prepared database protection key, but this is likely to become a burden in terms of efficiency:

  • it only really adds value when the API reaches out to another process (with different rights)
  • the API blocks the optimisations resulting from cloning of the hashing context
  • the API supports incorporation of a secret into the hash, and it supports searching for it, but the search is often inefficient
  • caching these inefficient search results runs into the problem that updates are missed (though a failure could be triggered after a key object has been removed, thus triggering a new search, which complicates code structure)

This is why we simply advise keeping the database protection key away from software that touches remote network connections. The design deliberately uses a two-phase approach with a keyed hash, so this separation is possible. All that is required for this to work is the ability to communicate prepared keys.

When this is taken care of, the only party capable of iteration through guessing would be the party that has access to the database protection keys, which under the given specifications would be a privileged process. Without privilege escalation, it should be impossible for a remote cracker to get this far. Operating systems and common operational practices as well as coding styles for services help to avoid this escalation. Think of using chroot() and dropping user identity. The proposed complete preparation of the database protection keys in another process is even more potent.

The combined use of one database for all the lookups that are described herein, as well as potential other lookups (such as a map from groups to group members), helps to obfuscate the information.

Finally, proper key management involves key rolling support. This is indeed built in. An IdentityHub may start preparing for a new key by publishing <key,value> pairs, then switch the default key used in the services and then clean up the <key,value> pairs made with the older key. As long as these are provided to the service in a strict sequence without missing intermediate values, this will cause no interruption of the service. We have adopted LDAP for these reasons. The handling of LDAP would be a typical task taken care of by a process that is separate from the service; the LDAP task would need write privileges on the database whereas the service only needs to read it. Note how this separation helps to stop denial of service through random removal of <key,value> pairs by an attacker.

So, we end up with a separation of tasks in three layers:

  1. Database protection key preparation
  2. LDAP subscription, database updates, key changes, dom2key mapping
  3. Service with readonly access to database and dom2key mapping

The dom2key mapping is where the default key to use is indicated. This mapping is probably best modelled as a hash table, where hash clashes are only problematic when they refer to different keys; this leaves some room for a situationally clever approach to the table.

ARPA2 Work Package

The following tasks are needed to implement this facility in line with the IdentityHub infrastructure:

  1. DoNAI and Selector handling, plus database access. Multiple languages to support a large number of service applications. Initial targets are C and Python; another language of interest would be Go.

  2. Software to generate the <key,value> tuples for the databases in an unencrypted form, stored in LDAP. Services share these nodes from an alias starting under an object that describes the key to use.

  3. Software (based on LillyDAP) to encrypt these <key,value> tuples while in transit to a particular service. This includes updates sent in response to a SyncRepl subscription, as that allows services to immediately update their databases when the IdentityHub is configured differently. Encryption uses the key mentioned in a containing object (and when none is found, the object is blocked).

  4. Software (namely a backend for our Pulley client) to deliver LDAP-subscribed encrypted <key,value> pairs into a local database, where it can be used by the libraries.

  5. Implementations of services based on this principle. At least of interest are a Milter that can be plugged into several brands of mail servers to relay messages as we do under the InternetWide Architecture, and a similar concept for the Reservoir.