Implementing Groups and Roles for Services

The specification of group and role access for the InternetWide Architecture is complex. This specification describes an implementation mechanism that translates to linear computational complexity, thus yielding the benefits without costing too much.

The definitions of groups and roles are very similar, except for some of their implementation characteristics. Groups tend to have resources allocated while roles do not; roles will mostly be used for access control. We shall speak of groups below, but the same applies to roles; so when we speak of group members we implicitly also talk of role occupants.

Whether someone is entitled to communicate with a group at all is determined in a Communication ACL, with the group@example.com address as the local address (or the target of communication). Within the group, there will be more refined access controls for each member.

But there is more. Group members have their own member name, and are uniquely identified within a group as group+member@example.com which is the only address that we intend to reveal to other group members. Tied to this member address is an alias to the user address or, if it is an external member, a complete remote address.

Group definitions are independent of any protocol. This means that they apply just as well for email as for chat, telephony and the Reservoir service. The precise interpretation of group semantics is likely to vary among service providers. In general, the intention is to have on-the-fly group facilitation over all services by merely defining the group. Groups and roles are likely to be interpreted differently; for instance, calling a group may set up an instant conference call, while calling a role might chase until it finds one available role occupant. This distinction between groups and roles is a matter of service semantics, however; all that the InternetWide Architecture aims to do is facilitate the possible distinction.

Note: A more general spec describes how we handle the Communication ACL and Resource ACL, referenced below. It also provides the details for the hashing scheme to find the database index key, and continued use of that computation to derive a decryption key for the value found.

Push and Pull Group Services

TODO: Consistently rename Push Services to Communication Services and Pull Services to Resource Services, since that is how service types coincide with ACL types.

We generally distinguish two kinds of group service, namely push services such as email that actively seek out members, and pull services such as a conference environment that sit and wait until members come to them. Mixtures of these modes can be imagined, but the discussion below treats them as separate and leaves the mixing to the service at hand. Once more, we aim to facilitate structures that we think will be useful, without prescribing them in more detail than needed.

Pull services may be configured as simply as allowing access to the service. There may be no need for group member lists, because no active searches are performed. The Communication ACL may already suffice to decide on access to the group, though an additional Resource ACL, using a resource UUID for the service at hand and a resource instance UUID for the group, can provide additional information on who has what privileges for the services of this group. In addition, a mapping between user/alias and their member alias would be needed so their registered address can be concealed by posing under a group member identity.

Push services need a list of members to communicate to. For each of the members, there will be a delivery address, usually a user+alias under the same domain, but also possible is an external address. Members are assigned access rights to the group resource. The need to see the list of members may lead to a privacy notification when a user tries to add it as a plugin service, once we start supporting those.

Group Addressing with Filtering Conditions

The Communication ACL only checks on the beginning of a local part. So, when directly addressing a group member cook+johann@example.com, the Communication ACL is only checked for access to cook@example.com. The actual use of the group will assign meaning to cook+johann by looking up that member name. This means that individual group members can be addressed through the group mechanism. As explained before, addresses of users that happen to be group members will be replaced by their group member name, so a call that passes through the group to cook+johann@example.com may appear to have come from cook+mary@example.com, regardless of the original address from which it was sent; even external addresses undergo this treatment, provided that the external address is registered with the group.
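The split between the part checked by the Communication ACL and the part left to the group can be sketched as follows. This is a minimal illustration only; the function name `split_group_address` is hypothetical and not part of any ARPA2 library.

```python
def split_group_address(address):
    """Split a target address into the address used for the Communication ACL
    lookup and the extension that the group mechanism interprets later.

    Hypothetical helper for illustration only.
    """
    local, _, domain = address.rpartition("@")
    # Only the beginning of the local part, up to the first "+", is checked.
    group, _, extension = local.partition("+")
    return f"{group}@{domain}", extension


print(split_group_address("cook+johann@example.com"))
# ('cook@example.com', 'johann')
```

Everything after the first + is handed on unchanged, so the group can assign its own meaning to it.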

Email to the group cook@example.com would reach all subscribing listeners, which may or may not include all member addresses; some members may be silently available. A good example could be an archiver that is not normally active but that may be triggered if so requested. We call this the non-subscribing archiver, and could address it directly under a member address such as cook+nsa@example.com. As long as it is explicitly mentioned in continuing traffic, it will archive messages as they pass by, but as soon as it is dropped as a recipient it will no longer be kept up to date.

Group addresses can have pretty frivolous syntax. They may specify multiple members with cook+john+mary in the local part, and it is even possible to explicitly remove members (such as someone whose birthday present needs to be discussed) with cook+-+john or even cook+-+john+mary to remove the two addresses cook+john and cook+mary from the delivery list of any messaging. Note, once again, that the interpretation of this depends on the service at hand. It might be that an attempt by cook+john to call into a conference is blocked, for instance. TODO: Or do we want to allow the dash - as a possible member name, and use ++ instead? Pretty arbitrary choice.

The task of group handling is to collect the addresses that are involved, and ensure that these all get their service as a group. Where possible, this would also be the place to remove duplicate group+member identities, so as to avoid pushing out the service to the same member more than once. A well-known place where this can occur is in an email list, where replies are sent to both the group and the originator, the following reply goes to the group, the originator and the replier, and so on. There is no added value in seeing all these copies, and the group logic is where this should be avoided, if possible. For roles, overlap might indicate a preference and lead to different handling.

The method by which duplicates are avoided is surprisingly simple. Even a push service does not unfold the group address into member addresses and then pick out doubles; instead, it iterates over the complete group and uses the group address(es) as filtering conditions. This is not just a straightforward mechanism for avoiding double addresses; it also provides good semantic equivalence with pull services. One advantage of this is that we could add frivolity to group addresses without worrying about delivery constraints in push services; we could use an advanced notation involving set logic if we wanted!
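A rough sketch of filtering by group address, under assumptions: the names `parse_targets` and `select_members` are made up for this example, member names after the - marker are read as removals, and names before it as an explicit selection. The TODO on the dash syntax is still open, so this shows only one possible reading.

```python
def parse_targets(local_part):
    """Parse a local part such as 'cook+-+john+mary' into
    (group, included names, excluded names). Hypothetical helper."""
    group, *words = local_part.split("+")
    include, exclude = set(), set()
    current = include
    for word in words:
        if word == "-":
            current = exclude      # names from here on are removals
        elif word:
            current.add(word)
    return group, include, exclude


def select_members(local_part, members):
    """Iterate over the complete group and keep the members that pass the
    filter; nothing is unfolded, so no member can be selected twice."""
    _, include, exclude = parse_targets(local_part)
    return [m for m in members
            if (not include or m in include) and m not in exclude]


members = ["john", "mary", "johann"]
print(select_members("cook", members))              # ['john', 'mary', 'johann']
print(select_members("cook+john+mary", members))    # ['john', 'mary']
print(select_members("cook+-+john+mary", members))  # ['johann']
print(select_members("cook+john+john", members))    # ['john']
```

Because the loop runs over the member list rather than over the addressed names, repeating a name in the address cannot cause a second delivery.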

A result of using group addresses as filtering conditions is that it is possible that nothing matches. One might write to a group+member that is not subscribed at that time. We should formally check the right to check group membership before we decide between reporting non-delivery or silently swallowing a delivery to an empty set.

Sender Privacy

The address of a sender to a group is somewhat private. Assuming that a member alias has been set up, the (local or remote) sender has their address replaced by a group+member@example.com address. This gives the group more control over access rights and routing, while the member does not make its address public by posting content.

Privacy in terms of groups is normally related to the sender. This is because the recipients are usually addressed by their group+member alias under the group's domain, as this is how they are known to the group; they send to the group with that local part in the address, and services such as XMPP or IRC would simply use the member part as a nickname within the discussion group. The assumption here is that delivery will be to each member's delivery address without showing that to the sender or other recipients; if that does not apply to a service then it should also work to protect the privacy of any extra addresses shown.

Sender privacy is not implemented as part of the group mechanism, but provided through the general mechanism of aliasing in the Communication ACL code. The special form group+member mentioned there sets the alias to be used, and by replacing the user name it also indicates that a group or role is being suggested.

Group members may want to use a different descriptive name than their default name. If this is desired, it should be set up in the user+alias that is linked to the group+member address. When selecting an alias, the descriptive name can be overwritten if so desired. TODO: Not currently defined: a database mapping KH[user+alias] KHE[descriptions_and_preferences]. TODO: How to look up alias info when we immediately jump to the group member?

The mapping from a member name +member to a delivery address for the user is part of the group mechanism, at least for push services.

It is worth noting that (email) bounces are sent back to the sender to a group, rather than to a general handling address. When this happens, sender privacy must still be observed. TODO: Use regexps to protect delivery addresses mentioned inline in a bounce text? The bounce may be filtered out when it is targeted at someone without the right to check group membership.

Resource ACL for Group Management

Group management ACLs are in fact two different ACLs, each serving their own purpose:

The management of group members (adding and removing [other] members)
The management of group data (access to group-shared data)

These are kept as two independent Resource ACLs. Each uses a fixed Resource UUID and follows that with an instance UUID for the group at hand. The two group management ACLs share the same instance UUID for the same group.

TODO: Allocation of Resource [Instance] UUID values. Who, how, when? (who is IdentityHub)(when is group creation)(why is for resource access)

Access Rights for Group Memberships

The first level of Resource ACL for groups is concerned with membership management. This may be a task assigned to an operator, to each current member or to anyone (such as for a public group). We assign the following interpretation to the various access rights:

A is for IdentityHub operation including editing Group ACLs
S is for exceptional plugin services granted group administration rights
D is the right to remove other members
C is the right to add members (self or other)
W is not used (yet)
R is for reading group membership lists (push services)
P is not used (yet)
K for checking if a group member name exists
O for editing and removing one's own records
V for unregistered users (note: both local and external users can be registered)

Groups assign management when they are created, which can be set to be the one creating the list (probably a good default) or anyone in the group at the time an action is taken. Note that the latter may include external members.

The default group is set up with ACLs that assign rights as follows, where each right also implies the rights listed below it:

A is authorised to the IdentityHub (+arpa2+idhub@example.com)
S is authorised to nobody, but exceptional plugin services might land here
D is authorised to group management
C is authorised to group management
W is authorised along with C
R is authorised to push services (xmpp.example.com)
P is authorised along with R
K is authorised to all group members (group+@example.com)
O is authorised to all group members (group+@example.com)
V is authorised to everyone (@.)

Note that for public lists, there are a few possible changes:

C consists of anyone, namely @.

Access Rights for Group Data

The second level of Resource ACL for groups is concerned with the resources that can be shared between members. For an email list this would be the emails; for Reservoir it would be documents/objects shared, for telephony it might be a conference session. The interpretation of the access rights is specific for the service that handles it, but the automatic application of groups over a variety of services requires a general outline:

A is for IdentityHub operation
S is for exceptional plugin services granted group administration rights
D is the right to remove shared resources
C is the right to add shared resources
W is the right to write a new version of a shared resource
R is the right to read a shared resource
P is the right to ask unprivileged questions to supportive types of resource
K is the right to see the shared resource and its metadata
O is the right to accept and pass ownership of a resource
V is the right to review public portions of shared resources

By default, groups are set up as follows, where rights automatically imply the rights below them:

A is authorised to the IdentityHub (+arpa2+idhub@example.com)
S is authorised to nobody, but exceptional plugin services might land here
D is authorised to the group (group+@example.com)
C is authorised to the group (group+@example.com) and possibly also to external posters
W is authorised to the group (group+@example.com)
R is authorised to the group (group+@example.com)
P is authorised to nobody
K is authorised to the group (group+@example.com)
O is authorised to the group (group+@example.com)
V is authorised to everyone (@.)

Note that for publicly writeable lists, there are a few possible changes:

C consists of anyone, namely @.
W consists of anyone, namely @.
O consists of anyone, namely @.

Note that for publicly readable lists, there are a few possible changes:

R consists of anyone, namely @.
P consists of anyone, namely @.
K consists of anyone, namely @.
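The default profile for group data and its two public variants can be written down as plain data. The sketch below merely restates the lists above; the dictionary layout and the function name `data_acl` are assumptions for illustration, and `None` stands for "authorised to nobody".

```python
ANYONE = "@."
GROUP = "group+@example.com"

# Default Access Rights for Group Data, one selector per right.
DEFAULT_DATA_ACL = {
    "A": "+arpa2+idhub@example.com",
    "S": None,      # nobody, but exceptional plugin services might land here
    "D": GROUP,
    "C": GROUP,     # possibly also external posters
    "W": GROUP,
    "R": GROUP,
    "P": None,
    "K": GROUP,
    "O": GROUP,
    "V": ANYONE,
}


def data_acl(publicly_writeable=False, publicly_readable=False):
    """Return a copy of the default profile with the public changes applied."""
    acl = dict(DEFAULT_DATA_ACL)
    if publicly_writeable:
        for right in "CWO":
            acl[right] = ANYONE
    if publicly_readable:
        for right in "RPK":
            acl[right] = ANYONE
    return acl


print(data_acl(publicly_readable=True)["R"])  # @.
print(data_acl()["R"])                        # group+@example.com
```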

Database Storage for Groups

TODO: This is a general concept, namely that of a Resource ACL. Write it like that in a separate specification document, acl-impl-resource and reference it from here. Describe it as a list of instances with aliases and access rights for each; mention that the instance might be mentioned in the key for another usage pattern, namely one that queries instead of iterates. That query form however, may be more generally useful and it also leads to less facilitation for services (and more privacy).

The group database is in many ways similar to the Communication ACL and Resource ACL, and it may in fact be merged into the same database because it too uses a keyed hashing scheme, and the hashed values do not overlap.

Note that the keys for a group are not shared with the local identities in the Communication ACL. This may sound like it saves time to try both a user and a group, but in reality it adds time and/or space complexity because the Communication ACL incorporates the remote identity or a generalisation thereof. This is generally undesirable for groups, which have remote patterns set up in the value retrieved from the database, where they may actually form a list.

The information stored is different for pull services and push services. Mixed services may require both.

Resource ACL for Pull Group Services

Pull services do not need a member list, and it is likely that they will not be provided when we separate the IdentityHub and ServiceHub components in our third project phase. What pull services need, is simply access control to the resources held in the group. Clients will connect and create, update, read and delete elements, but always at the initiative of the client.

This means that an ACL suffices to decide about access to the shared resources. The group is assigned a standard Resource UUID (TODO: WHICH GROUP/ROLE) and an additional Resource Instance UUID distinguishes the group from other groups. This information is used in the usual manner to decide on resource access through a Resource ACL.

TODO: Do we want to use distinct Resource UUID values depending on the service?

TODO: Where can a pull service find the Resource Instance UUID? [Might a push service benefit from it for group DB key encryption?]

Key-Value Database for Push Service Groups

For push services, there is a need to access a member list, iterate over its elements and actually push out resources when they are pushed in by a member. Think of an email list as a simple example, but keep in mind that the actual mechanism is generic.

Values describing Push Service Groups

Push services require a method to iterate over list members. The database record for doing this is stored as a value in a key-value database. The contents of that value follow now.

The value stored for a group consists of lines ended by a single U+000A character (line feed, or '\n'). The first line is configuration data; the remaining lines define access rights for +member identities and their delivery addresses.

The first line is the configuration line. For now, it holds two words separated by a space U+0020, but in the future there may be extra words separated by more spaces U+0020; these intermediate words must be ignored when they occur. The currently defined words are:

the first word starts with either the capital ASCII letter "G" (U+0047) to indicate a group description, or "R" (U+0052) to indicate a role description
the last word holds 3 "at" characters U+0040 with the letters of Access Rights for Group Membership between the first and second, and the letters of Access Rights for Group Data between the second and third TODO: Or is it better to share a UUID here, pointing to a Resource ACL?

The last word's Access Rights are applied to external addresses, so non-member addresses; note that an external address may still have a member address. Non-member addresses are easily recognised as being senders whose address local part does not start with the group name, or whose domain differs from that of the group. To submit to the group from an external entity, usually CWO rights will be required by the service; to receive information from the group, usually RPK rights will be required; services may require read access for all who attempt to establish write access, but since individual members may also do just one of these things, it is possible for non-member addresses to be just readers, or just writers.
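As a hedged illustration of the checks described above, the sketch below recognises non-member senders and tests the usual rights combinations. All names are hypothetical, and the member test is stricter than the text: it assumes that "starts with the group name" means the local part is either the group name itself or the group name followed by a +.

```python
SUBMIT_RIGHTS = set("CWO")    # usually required to submit to the group
RECEIVE_RIGHTS = set("RPK")   # usually required to receive from the group


def is_non_member(sender, group_address):
    """True when the sender address is not of the group+member form under
    the group's own domain. Hypothetical helper for illustration."""
    group, _, group_domain = group_address.rpartition("@")
    local, _, domain = sender.rpartition("@")
    member_form = local == group or local.startswith(group + "+")
    return not member_form or domain.lower() != group_domain.lower()


def may_submit(data_rights):
    return SUBMIT_RIGHTS <= set(data_rights)


def may_receive(data_rights):
    return RECEIVE_RIGHTS <= set(data_rights)


print(is_non_member("mary@example.net", "cook@example.com"))       # True
print(is_non_member("cook+mary@example.com", "cook@example.com"))  # False
print(may_submit("CWO"), may_receive("CWO"))                       # True False
```

A service remains free to demand a different combination; the two constants only capture what the text calls the usual requirement.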

The lines after the first configuration line are either:

a line with 3 "at" characters U+0040 with the letters of Access Rights for Group Membership between the first and second, and the letters of Access Rights for Group Data between the second and third
a line with a member name as +member, then a space U+0020 and a delivery address for the member, user+alias for a local user or user@example.net for a remote user.

Until the first line with Access Rights for Group Data has been processed, the one at the end of the first line applies. After such a line has been processed, further lines are assigned its rights, until new ones are provided. Usually, this means that the second line is an access rights line, to avoid assigning the rights for external entities to a member.
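The line format can be illustrated with a small parser. This is a sketch under assumptions: the sample value, the `Member` record and the function names are invented for the example, and rights letters are assumed to be upper-case ASCII.

```python
import re
from dataclasses import dataclass

# A rights word holds three "at" characters: membership rights between the
# first and second, data rights between the second and third.
RIGHTS_RE = re.compile(r"@([A-Z]*)@([A-Z]*)@")


@dataclass
class Member:
    name: str                # member name without the leading "+"
    delivery: str            # user+alias (local) or user@example.net (remote)
    membership_rights: str
    data_rights: str


def parse_rights(word):
    match = RIGHTS_RE.fullmatch(word)
    if match is None:
        raise ValueError(f"not an access rights word: {word!r}")
    return match.group(1), match.group(2)


def parse_group_value(value):
    lines = value.split("\n")
    if lines[-1] == "":
        lines.pop()                      # every line ends in U+000A
    words = lines[0].split(" ")          # intermediate words are ignored
    kind = {"G": "group", "R": "role"}[words[0][:1]]
    external = parse_rights(words[-1])   # rights for external addresses
    current = external                   # applies until a rights line is seen
    members = []
    for line in lines[1:]:
        if line.startswith("@"):
            current = parse_rights(line)
        elif line.startswith("+"):
            name, delivery = line[1:].split(" ", 1)
            members.append(Member(name, delivery, *current))
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return kind, external, members


EXAMPLE = (
    "G @V@V@\n"
    "@KO@DCWRKO@\n"
    "+john john+cooking\n"
    "+mary mary@example.net\n"
    "@KO@CWKO@\n"
    "+nsa archive+cook\n"
)
kind, external, members = parse_group_value(EXAMPLE)
print(kind, external)                              # group ('V', 'V')
print([(m.name, m.data_rights) for m in members])
# [('john', 'DCWRKO'), ('mary', 'DCWRKO'), ('nsa', 'CWKO')]
```

In the sample, the second line is an access rights line, so no member inherits the rights meant for external entities.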

The membership list can be extended by anyone with the C privilege in the Access Rights for Group Membership. Such extensions are +member to delivery address mappings that will be appended as extra lines at the end of this record. In other words, the last line of Access Rights for Group Data automatically applies to these new entries.

The mappings from +member to a delivery address can be processed in-order, and need not be applied in reverse. A reverse mapping is useful for privacy, but only for a content originator (or "sender") address, which is handled through the Communication ACL as it redirects the communication attempt to group+member. This even works for external addresses, for whom the group+member local part is relative to the target domain, as it is for a domain's local users. The group handling software should care for any technical content reference to the sender address, and change it into the corresponding group+member@example.com address.

Only guest access is a little different; guests that pass through the group access rights without being kicked out will not have their identity concealed.

A push service can iterate over this database value and process each entry in turn, bestowing each with its pushy habit. While doing this, it applies a few simple rules:

non-group addresses, when permitted by the list setup, may be copied in by including them as targets
one user may have multiple group member addresses, provided that each has its own delivery address (usually, its own +alias in the local part); these are treated as independent addresses
group member addresses may be frivolous expressions, and are interpreted as set descriptors, bounding the individual member identities in the group description lines
when interpreting frivolous expressions, the delivery addresses with R rights are always passed; delivery addresses without R rights are only passed when their +member is explicitly mentioned in the set of targeted addresses
there is no risk of delivering more than once to the same recipient; not because the list mechanism keeps track of recipients, but because iteration is over group members who are then matched against a target descriptive set
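These rules can be sketched as a single pass over the member list. The sketch assumes that each targeted group address has already been parsed into a pair of sets (explicitly named members, explicitly removed members) and that several targeted addresses combine as a union; both the representation and the name `push_deliveries` are assumptions made for illustration.

```python
def push_deliveries(members, descriptors):
    """Return the delivery addresses to push to.

    members:     list of (member name, delivery address, data rights)
    descriptors: list of (include set, exclude set), one per targeted
                 group address; an empty include set means "the whole group"
    """
    deliveries = []
    for name, delivery, data_rights in members:   # iterate over the group
        selected = False
        for include, exclude in descriptors:
            if name in exclude:
                continue
            if name in include:
                selected = True       # explicitly mentioned: always passed
            elif not include and "R" in data_rights:
                selected = True       # whole group: only holders of R
        if selected:
            deliveries.append(delivery)   # at most once per member
    return deliveries


members = [
    ("john", "john+cooking", "RKO"),
    ("mary", "mary@example.net", "RKO"),
    ("nsa", "archive+cook", "KO"),        # non-subscribing archiver
]
print(push_deliveries(members, [(set(), set())]))
# ['john+cooking', 'mary@example.net']
print(push_deliveries(members, [(set(), set()), ({"nsa"}, set())]))
# ['john+cooking', 'mary@example.net', 'archive+cook']
```

Whether a removal in one targeted address should override a mention in another is not settled by the text; the union used here is only one option.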

The structures are not defined for groups with thousands of members, though this mechanism may actually scale well enough. The expectation is that the conceptual simplicity relative to full-featured mailing lists and such becomes a limitation well before the linear simplicity of these lists does.

For one, there is no automated handling for bounces, in services such as email. Having said that, plugin services could be imagined that do just that. Normally however, an email to a group would bounce to the original sender if a recipient is not reachable.

Keys describing Groups

Groups are indexed by a key that holds their group name, group@example.com, constructed in much the same way as for Communication ACL and Resource ACL keys.

As before, a preparation is made in phase one of the keyed hash, based on a key for the IdentityHub that supplies the domain setup. Then, in a second phase hash, a string is hashed, with at least one x character attached until the number of bytes for the phase 1 hash and the string plus padding is a multiple of the hash algorithm's block size. The string to use is represented as a hexdump below:

00000000  47 52 4f 55 50 20 4d 45  4d 45 52 20 4f 52 20 52  |GROUP MEMER OR R|
00000010  4f 4c 45 20 4f 43 43 55  50 41 4e 54 20 4c 49 53  |OLE OCCUPANT LIS|
00000020  54 20                                             |T |
00000022

After this string the normalised group name will be appended. Since it is a local name, anything from the first + symbol will be trimmed off the local part. This includes the member name and frivolous expressions; it may be compared to removing an alias from a user name. There is an exception, as with users, namely when the local part ends in a + symbol to form ...+stat+DYN+ in which case it will be reduced to ...+stat++ in support for special constructs for allowing others access to a group or role, for example based on a temporary address.
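The input to the second hashing phase can be sketched as follows. Only the fixed string and the plain trimming rule are covered: the keyed hash itself, the x padding and the trailing + exception are deliberately left out, and the function names and the UTF-8 encoding are assumptions of this sketch. The fixed string is spelled exactly as in the hexdump.

```python
# The 34-byte (0x22) string from the hexdump, copied byte for byte.
PREFIX = b"GROUP MEMER OR ROLE OCCUPANT LIST "


def normalise_group_name(address):
    """Trim everything from the first '+' off the local part.
    Hypothetical helper; local parts ending in '+' are not handled."""
    local, _, domain = address.rpartition("@")
    if local.endswith("+"):
        raise NotImplementedError("trailing '+' forms are outside this sketch")
    group = local.partition("+")[0]
    return f"{group}@{domain}"


def group_key_input(address):
    """Fixed string followed by the normalised group name, before padding."""
    return PREFIX + normalise_group_name(address).encode("utf-8")


print(len(PREFIX))                                      # 34
print(group_key_input("cook+-+john+mary@example.com"))
# b'GROUP MEMER OR ROLE OCCUPANT LIST cook@example.com'
```

Member names and frivolous expressions therefore never influence the database key; every form of address for one group leads to the same record.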

Encryption of Group Descriptions

The contents of group member lists (or role occupants lists) are subject to privacy concerns, possibly even more than aliases and contact lists. The same approach as for Communication ACL and Resource ACL is in place here, based on the same string attached to the unfinished hash algorithm for the key computation.

Group Communication ACL

Regardless of the push or pull nature of a group, a Communication ACL is assumed to limit who may access the group, just as is the case with communication access to users. Note that the Communication ACL is concerned with Access Rights for Group Data, not for Group Membership. It is assumed that subscription occurs outside the regular group communication channels.

The group has an address, and communication to the group is constrained using the same kind of ACLs that apply to plain user addresses. But unlike for user accounts, it is not an alias that will be delivered, but any original address as requested, as long as it starts with the group name. That start has already been represented in the database key for the Communication ACL lookup, so any extension can be accepted. To that end, the Communication ACL has a form + to pass the complete and originally targeted address. Since it is clear from that notation that either a group or role is implicated, the lookups continue as described herein.

What we still need to define, is what entries are white listed and black listed by the group's Communication ACL.

The black list forbids group access by anyone, so @. unless the group is open for public reading and/or writing.

The white list grants access from all member names, so to group+@example.com. This indicates that users' addresses must employ their authorisations to be mapped to a group+member@example.com form. Similarly, external addresses are listed but must be mapped to their group+member@example.com name. In both cases, the entry in the ACL is group+member and the domain name is provided by the targeted local address group@example.com after any part after a + has been taken out during normalisation.

Responsibilities of the IdentityHub

The information describing groups and the various ACLs is managed from the IdentityHub. This means that it is the task for the IdentityHub to collect the information from various inputs and output it in the various databases (and apply encryption for the various services).

The duality of push and pull services is supported by the IdentityHub, which takes care of setting up both from the same group membership and role occupancy relations. Consistency between these dual interfaces is also the responsibility of the IdentityHub.

Finally, the IdentityHub is the interface through which users interact with domain configuration. There is at least one kind of interface that can be used to add and remove groups, and for each, to manage members. There is at least one (default) profile for new groups and (if useful) new memberships, to simplify the actions involved. Alternatives (like what is public and what not) are presented (at least) while setting up the group or role.

ARPA2 Work Package

These facilities are generic, and can be implemented in libraries that load into a diversity of services. This is helpful in adding the value of the IdentityHub structures to all those.

The key-value database handling, including encryption and decryption, is a generic layer that serves the group/role mechanisms as well as the Communication ACL and Resource ACL.
The pull service mechanism is supported with the Resource ACL mechanism, and must merely be integrated into the group and role forms of identity.
The push service mechanism is new, and needs a generic support library. Probably with callbacks for each <member,delivery_address> pair.
We shall need to look into per-alias settings such as a common-name override. This may be located in another ARPA2 Work Package.
We shall need to apply the push mechanism in a module that expands MTA functionality. This would best be done in an easily portable form, such as an SMTP service or maybe even a Milter.
There is a need to validate access rights for individual members.
As part of the IdentityHub, generate the data in the ServiceHub as per these specifications.
TODO: Complete? Overview?