Sam Tunnicliffe

Cassandra has some great advantages over mainstream database systems in core areas such as scaling, resilience and performance. However, given its relative youth, it has understandably lagged behind products with decades of development in a few places. One of those is permissions management, where the upcoming 2.2 release will continue to narrow the feature gap by adding significant new functionality to the authentication and authorization subsystem.

<h2>Introducing Roles</h2>

Cassandra has supported pluggable user and permissions management since its very&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-547">early</a>&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-1237">versions</a>,&nbsp;and this has&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-4490">evolved</a>&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-5003">significantly</a>&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-4874">over</a>&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-4490">time</a>. In 1.2.2 we&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-4898">began including</a>&nbsp;CQL-based internal authenticator and authorizer implementations in the core distribution. As part of a&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-8394">broad reworking</a>&nbsp;of the auth subsystem, Cassandra 2.2 will introduce a number of further enhancements in this area.

<a href="https://issues.apache.org/jira/browse/CASSANDRA-7653">One specific improvement</a>&nbsp;is to replace the simplistic approach of managing permissions on an individual user basis with something much more powerful and flexible:&nbsp;<a href="http://en.wikipedia.org/wiki/Role-based_access_control">role-based access control (RBAC)</a>. Under this new scheme, permissions are granted to a role just as they were previously granted to a user; the key difference is that roles can also be granted to each other, so in this context we can think of them as groups rather than individuals. This greatly simplifies permissions management for administrators: related privileges are bundled together by granting them to a role, which can in turn be assigned to specific database users. Some new constructs have been added to the CQL syntax to support this. For example, a simple scenario looks something like this:

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>CREATE</code> <code>KEYSPACE warehouse </code><code>WITH</code> <code>REPLICATION = {</code><code>'class'</code><code>:</code><code>'SimpleStrategy'</code><code>, </code><code>'replication_factor'</code><code>:1};</code>

			<code>USE warehouse;</code>

			&nbsp;

			<code>CREATE</code> <code>TABLE</code> <code>addresses (</code>

			<code>&nbsp;&nbsp;</code><code>customer_id </code><code>bigint</code><code>,</code>

			<code>&nbsp;&nbsp;</code><code>address_id </code><code>int</code><code>,</code>

			<code>&nbsp;&nbsp;</code><code>address text,</code>

			<code>&nbsp;&nbsp;</code><code>PRIMARY</code> <code>KEY</code> <code>(customer_id, address_id)</code>

			<code>);</code>

			&nbsp;

			<code>CREATE</code> <code>TABLE</code> <code>orders (</code>

			<code>&nbsp;&nbsp;</code><code>customer_id </code><code>bigint</code><code>,</code>

			<code>&nbsp;&nbsp;</code><code>order_id timeuuid,</code>

			<code>&nbsp;&nbsp;</code><code>product_id uuid,</code>

			<code>&nbsp;&nbsp;</code><code>product_description text,</code>

			<code>&nbsp;&nbsp;</code><code>PRIMARY</code> <code>KEY</code> <code>(customer_id, order_id, product_id)</code>

			<code>);</code>

			&nbsp;

			<code>CREATE</code> <code>ROLE supervisor;</code>

			&nbsp;

			<code>GRANT</code> <code>MODIFY</code> <code>ON</code> <code>warehouse.orders </code><code>TO</code> <code>supervisor;</code>

			<code>GRANT</code> <code>SELECT</code> <code>ON</code> <code>warehouse.addresses </code><code>TO</code> <code>supervisor;</code>
			</td>
		</tr>
	</tbody>
</table>

So now we have a role, <code>supervisor</code>, with permission to modify data in the <code>orders</code> table and to read from the <code>addresses</code> table. When we have a new database user who should be able to act as a supervisor, we just grant them that role.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>CREATE</code> <code>ROLE pam </code><code>WITH</code> <code>PASSWORD</code> <code>= </code><code>'password'</code> <code>AND</code> <code>LOGIN = </code><code>true</code><code>;</code>

			<code>GRANT</code> <code>supervisor </code><code>TO</code> <code>pam;</code>
			</td>
		</tr>
	</tbody>
</table>

Let's examine those last two statements. The first creates another role, named&nbsp;<code>pam</code>,&nbsp;and sets its&nbsp;<code>LOGIN</code>&nbsp;attribute to true. As you might expect, this is what enables a database user to identify as this role when logging in via a client such as cqlsh. We also assigned Pam a password, as we're using Cassandra's internal password authentication mechanism. There's one other attribute we could set when creating a new role: superuser status is specified at the role level, which we would do by adding&nbsp;<code>AND SUPERUSER = true</code>&nbsp;to the&nbsp;<code>CREATE ROLE</code>&nbsp;statement. Finally, note that anything that can be set in&nbsp;<code>CREATE ROLE</code>&nbsp;can be modified later using&nbsp;<code>ALTER ROLE</code>&nbsp;(so we could retrospectively make Pam a superuser if we chose). The second statement grants the&nbsp;<code>supervisor</code>&nbsp;role to&nbsp;<code>pam</code>, so Pam is now permitted to do all the things a supervisor can do:

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>LIST </code><code>ALL</code> <code>PERMISSIONS </code><code>OF</code> <code>pam;</code>

			&nbsp;

			<code>&nbsp;</code><code>role&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | username&nbsp;&nbsp; | resource&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | permission</code>

			<code>------------+------------+-----------------------------+------------</code>

			<code>&nbsp;</code><code>supervisor | supervisor | &lt;</code><code>table</code> <code>warehouse.addresses&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>SELECT</code>

			<code>&nbsp;</code><code>supervisor | supervisor |&nbsp;&nbsp;&nbsp; &lt;</code><code>table</code> <code>warehouse.orders&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>MODIFY</code>
			</td>
		</tr>
	</tbody>
</table>

(The&nbsp;<code>username</code>&nbsp;column is there simply to provide backward compatibility with the results of&nbsp;<code>LIST PERMISSIONS</code>&nbsp;in previous releases.)

If we were to add a new table to which supervisors require access, we would simply grant the necessary permissions on it to the supervisor role and Pam, along with all other users assigned the role, would automatically acquire them.
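As a rough sketch of that scenario, suppose a <code>shipments</code> table is later added to the <code>warehouse</code> keyspace. The table name and its columns are invented purely for illustration; the point is only the single <code>GRANT</code> at the end.

```cql
-- Hypothetical new table; the name and schema are assumptions for this example.
CREATE TABLE warehouse.shipments (
  order_id timeuuid,
  shipment_id timeuuid,
  carrier text,
  PRIMARY KEY (order_id, shipment_id)
);

-- One grant to the role is enough: pam, and any other role that has been
-- granted supervisor, picks up the permission through inheritance.
GRANT SELECT ON warehouse.shipments TO supervisor;
```

Note that no statement mentions <code>pam</code> directly, which is exactly the administrative saving that roles are meant to provide.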

We can go further, though. Let's create another role and grant it some permissions on all of the tables in another keyspace. Then, we'll assign our new role to Pam.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>CREATE</code> <code>ROLE office_admin;</code>

			<code>GRANT</code> <code>SELECT</code> <code>ON</code> <code>KEYSPACE office </code><code>TO</code> <code>office_admin;</code>

			<code>GRANT</code> <code>MODIFY</code> <code>ON</code> <code>KEYSPACE office </code><code>TO</code> <code>office_admin;</code>

			<code>GRANT</code> <code>office_admin </code><code>TO</code> <code>pam;</code>
			</td>
		</tr>
	</tbody>
</table>

And if we list Pam's permissions, we'll see they represent the aggregate of those granted to her roles.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>LIST </code><code>ALL</code> <code>PERMISSIONS </code><code>OF</code> <code>pam;</code>

			&nbsp;

			<code>&nbsp;</code><code>role&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | username&nbsp;&nbsp;&nbsp;&nbsp; | resource&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | permission</code>

			<code>--------------+--------------+-----------------------------+------------</code>

			<code>&nbsp;</code><code>office_admin | office_admin |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;keyspace office&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>SELECT</code>

			<code>&nbsp;</code><code>office_admin | office_admin |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;keyspace office&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>MODIFY</code>

			<code>&nbsp;&nbsp;&nbsp;</code><code>supervisor |&nbsp;&nbsp; supervisor | &lt;</code><code>table</code> <code>warehouse.addresses&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>SELECT</code>

			<code>&nbsp;&nbsp;&nbsp;</code><code>supervisor |&nbsp;&nbsp; supervisor |&nbsp;&nbsp;&nbsp; &lt;</code><code>table</code> <code>warehouse.orders&gt; |&nbsp;&nbsp;&nbsp;&nbsp; </code><code>MODIFY</code>
			</td>
		</tr>
	</tbody>
</table>

Likewise, we can ask which roles Pam has been assigned.

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>LIST ROLES </code><code>OF</code> <code>pam;</code>

			&nbsp;

			<code>&nbsp;</code><code>role&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | super | login | options</code>

			<code>--------------+-------+-------+---------</code>

			<code>&nbsp;</code><code>office_admin | </code><code>False</code> <code>| </code><code>False</code> <code>|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {}</code>

			<code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code>pam | </code><code>False</code> <code>|&nbsp; </code><code>True</code> <code>|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {}</code>

			<code>&nbsp;&nbsp;&nbsp;</code><code>supervisor | </code><code>False</code> <code>| </code><code>False</code> <code>|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {}</code>
			</td>
		</tr>
	</tbody>
</table>
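The <code>super</code> and <code>login</code> columns in that output reflect the attributes set when each role was created, and, as noted earlier, those attributes can be changed afterwards with <code>ALTER ROLE</code>. The following is a minimal sketch, assuming the statements are issued by a role with sufficient privileges over <code>pam</code>; the new password value is a placeholder.

```cql
-- Change the password used by the internal password authenticator.
ALTER ROLE pam WITH PASSWORD = 'a-new-password';

-- Retrospectively make pam a superuser.
ALTER ROLE pam WITH SUPERUSER = true;

-- Role assignments can be withdrawn again, for example if Pam
-- no longer needs access to the office keyspace.
REVOKE office_admin FROM pam;
```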

<h2>Inheritance and Hierarchies</h2>

As you can see, roles inherit the permissions of any other roles that they are granted. In the example above, the hierarchy of roles is extremely simple, but that need not be the case. It is perfectly possible to construct a much deeper structure, meaning admins can make permissions as fine-grained as necessary without incurring a huge administrative burden. One last thing to note regarding inheritance: whilst permissions and superuser status are inherited, the&nbsp;<code>LOGIN</code>&nbsp;attribute is not. In order for database users to identify as a particular role at login, that role must have its&nbsp;<code>LOGIN</code>&nbsp;attribute set to true. This prevents users inadvertently logging in under the identity of a group, like&nbsp;<code>supervisor</code>.
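To illustrate a slightly deeper hierarchy, here is a sketch that builds on the roles defined earlier. The <code>site_manager</code> and <code>jim</code> roles are hypothetical names introduced just for this example.

```cql
-- Hypothetical intermediate role that bundles the two roles defined earlier.
-- It is created without LOGIN, so nobody can log in as site_manager itself.
CREATE ROLE site_manager;
GRANT supervisor TO site_manager;
GRANT office_admin TO site_manager;

-- A login-enabled role for an individual; name and password are placeholders.
CREATE ROLE jim WITH PASSWORD = 'password' AND LOGIN = true;
GRANT site_manager TO jim;

-- Should include the permissions of both supervisor and office_admin,
-- which jim inherits indirectly via site_manager.
LIST ALL PERMISSIONS OF jim;
```

Only <code>jim</code> has <code>LOGIN</code> set, so the group-like roles stay purely as containers for permissions.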

<h2>Automatic Granting of Permissions</h2>

Another interesting aspect of this is that the creator of a role (that is, the role that the database user issuing the <code>CREATE ROLE</code> statement is logged in as) is automatically granted permissions on it. This enables users with role-creation privileges to also manage the roles they create, allowing them to <code>ALTER</code>, <code>DROP</code>, <code>GRANT</code> and <code>REVOKE</code> them. This automatic granting of 'ownership' permissions isn't limited to roles, either; it also applies to database objects such as keyspaces and tables (and soon to user-defined functions). This largely removes the requirement to have any active superuser roles, which reduces the risk of privilege escalation. See&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-7216">CASSANDRA-7216</a>&nbsp;and&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-8650">CASSANDRA-8650</a>&nbsp;for full details.
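To make that more concrete, here is a hedged sketch. It assumes that the <code>CREATE</code> permission on <code>ALL ROLES</code> is what confers role-creation privileges, and the <code>auditor</code> role is invented for the example; the linked tickets remain the authority on exactly which permissions are granted.

```cql
-- Run by a superuser: allow pam to create new roles.
GRANT CREATE ON ALL ROLES TO pam;

-- Run while logged in as pam: pam is now the creator of auditor.
CREATE ROLE auditor;

-- Still as pam: manage the new role without any superuser involvement.
ALTER ROLE auditor WITH PASSWORD = 'password' AND LOGIN = true;

-- The automatically granted permissions on the auditor role should
-- appear among pam's permissions.
LIST ALL PERMISSIONS OF pam;

-- pam can also remove the role she created.
DROP ROLE auditor;
```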

<h2>Under the Hood</h2>

At the implementation level, one aspect of this rework is to clarify the responsibilities of the various components. For instance, the methods handling user management have been moved from IAuthenticator to the new IRoleManager interface, leaving IAuthenticator implementations responsible purely for validation of credentials supplied during login. A nice side effect of this is that where an external authentication mechanism is used, there is no longer any need to create and manage users and roles directly in Cassandra as well as in the external system. By providing a custom IRoleManager implementation, user management and authentication can be completely delegated.

Of course, these changes to the user management model do require implementers of custom auth providers to make some changes to their code, but these should be fairly limited and straightforward. Check out the changes to&nbsp;<a href="http://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java">PasswordAuthenticator.java</a>&nbsp;and&nbsp;<a href="https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java">CassandraAuthorizer.java</a>&nbsp;as well as the new&nbsp;<a href="https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java">CassandraRoleManager.java</a>&nbsp;class in the 2.2 source tree for some pointers.

<h2>Upgrading</h2>

For systems already using the internal auth implementations, the process for converting existing data during a rolling upgrade is straightforward. As each node is restarted, it will attempt to convert any data in the legacy tables into the new schema. Until enough nodes to satisfy the replication strategy for the&nbsp;<code>system_auth</code>&nbsp;keyspace have been upgraded, and so have the new schema, this conversion will fail, with the failure being reported in the system log. During the upgrade, Cassandra's internal auth classes will continue to use the legacy tables, so clients experience no disruption. Issuing&nbsp;<a href="http://en.wikipedia.org/wiki/Data_control_language">DCL</a>&nbsp;statements during an upgrade is not supported. Once all nodes are upgraded, an operator with superuser privileges should drop the legacy tables&nbsp;<code>system_auth.users</code>,&nbsp;<code>system_auth.credentials</code>&nbsp;and&nbsp;<code>system_auth.permissions</code>. Doing so will prompt Cassandra to switch over to the new tables without requiring any further intervention.

A successful data conversion will report in&nbsp;<code>system.log</code>&nbsp;like so:

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<code>INFO&nbsp; [OptionalTasks:1] CassandraRoleManager.java:410 - Converting legacy users</code>

			<code>INFO&nbsp; [OptionalTasks:1] CassandraRoleManager.java:420 - Completed conversion of legacy users</code>

			<code>INFO&nbsp; [OptionalTasks:1] CassandraRoleManager.java:425 - Migrating legacy credentials data to new system table</code>

			<code>INFO&nbsp; [OptionalTasks:1] CassandraRoleManager.java:438 - Completed conversion of legacy credentials</code>

			<code>INFO&nbsp; [OptionalTasks:1] CassandraAuthorizer.java:396 - Converting legacy permissions data</code>

			<code>INFO&nbsp; [OptionalTasks:1] CassandraAuthorizer.java:435 - Completed conversion of legacy permissions</code>
			</td>
		</tr>
	</tbody>
</table>

While the legacy tables are present, a restarted node will re-run the data conversion and report the outcome, so that operators can verify that it is safe to drop them.
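The final step described above, dropping the three legacy tables, might look like the following in cqlsh. This is a sketch that assumes every node has already been upgraded and that the logs show a completed conversion.

```cql
-- Run once, as a role with superuser privileges, only after all nodes
-- are upgraded and system.log confirms the legacy data was converted.
DROP TABLE system_auth.users;
DROP TABLE system_auth.credentials;
DROP TABLE system_auth.permissions;
```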

As I mentioned, this is just a part of a wider reworking of the auth subsystem in Cassandra planned for inclusion in the 2.2 release. You can check out more detail and follow progress in&nbsp;<a href="https://issues.apache.org/jira/browse/CASSANDRA-8394">CASSANDRA-8394</a>.
