CVE-2021-20288: Unauthorized global_id reuse in cephx
Summary
Ceph did not ensure that reconnecting or renewing clients presented an existing ticket when reclaiming their global_id value. An attacker who was able to authenticate could claim a global_id in use by a different client and potentially disrupt other cluster services.
Background
Each authenticated client or daemon in Ceph is assigned a numeric global_id identifier. That value is assumed to be unique across the cluster. When clients reconnect to the monitor (e.g., due to a network disconnection) or renew their ticket, they are supposed to present their old ticket to prove prior possession of their global_id so that it can be reclaimed and thus remain constant over the lifetime of that client instance.
Ceph was not correctly checking that the old ticket was valid, allowing an arbitrary global_id to be reclaimed, even if it was in use by another active client in the system.
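The difference between the unchecked and the checked reclaim can be illustrated with a small toy model. This is only a conceptual sketch: the names ToyMonitor, Ticket, and require_valid_ticket are hypothetical, the "secret" string merely stands in for the cryptographic proof a real cephx ticket carries, and none of this reflects Ceph's actual code or wire protocol.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    """Hypothetical stand-in for a ticket bound to one global_id."""
    global_id: int
    secret: str  # placeholder for the proof carried by a real ticket


@dataclass
class ToyMonitor:
    """Toy model of global_id assignment; not Ceph's real data structures."""
    require_valid_ticket: bool
    next_id: int = 1000
    issued: dict = field(default_factory=dict)  # global_id -> Ticket

    def new_session(self) -> Ticket:
        """Assign a fresh global_id and issue a ticket for it."""
        self.next_id += 1
        ticket = Ticket(self.next_id, f"secret-{self.next_id}")
        self.issued[ticket.global_id] = ticket
        return ticket

    def reclaim(self, global_id: int, old_ticket: Optional[Ticket]) -> bool:
        """Return True if the caller may keep using global_id."""
        if not self.require_valid_ticket:
            # Unchecked behavior: any claimed global_id is accepted.
            return True
        # Checked behavior: the presented ticket must match the issued one.
        return old_ticket is not None and self.issued.get(global_id) == old_ticket
```

In this model, a caller that passes None or a forged ticket succeeds only while require_valid_ticket is False, which mirrors the flaw described above: the claim was accepted without proof of prior possession.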
Attacker Requirements
Any potential attacker must:
have a valid authentication key for the cluster
know or guess the global_id of another client
run a modified version of the Ceph client code to reclaim another client’s global_id
construct appropriate client messages or requests to disrupt service or exploit Ceph daemon assumptions about global_id uniqueness
Impact
Confidentiality Impact
None
Integrity Impact
Partial. An attacker could potentially exploit assumptions around global_id uniqueness to disrupt other clients’ access or disrupt Ceph daemons.
Availability Impact
High. An attacker could potentially exploit assumptions around global_id uniqueness to disrupt other clients’ access or disrupt Ceph daemons.
Access Complexity
High. The client must make use of modified client code in order to exploit specific assumptions in the behavior of other Ceph daemons.
Authentication
Yes. The attacker must be authenticated and have access to the same services as the client it wishes to impersonate or disrupt.
Gained Access
Partial. An attacker can partially impersonate another client.
Affected versions
All prior versions of Ceph monitors fail to ensure that global_id reclaim attempts are authentic.
In addition, all user-space daemons and clients starting from Luminous v12.2.0 were failing to securely reclaim their global_id following commit a2eb6ae3fb57 (“mon/monclient: hunt for multiple monitor in parallel”).
All versions of the Linux kernel client properly authenticate.
Fixed versions
Pacific v16.2.1 (and later)
Octopus v15.2.11 (and later)
Nautilus v14.2.20 (and later)
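When auditing a fleet, the list above can be turned into a simple version check. The following is a minimal sketch: the function names are hypothetical, and it assumes that release series newer than Pacific include the fix and that series older than Nautilus, for which no fixed release is listed, remain unpatched.

```python
# First fixed point release per release series, taken from the list above.
FIRST_FIXED = {16: (16, 2, 1), 15: (15, 2, 11), 14: (14, 2, 20)}


def parse_version(text: str) -> tuple:
    """Turn 'v15.2.11' or '15.2.11' into (15, 2, 11)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_fix(version: str) -> bool:
    """Return True if the version is at or past its series' first fixed release."""
    parsed = parse_version(version)
    major = parsed[0]
    if major > max(FIRST_FIXED):
        # Assumption: series released after Pacific carry the fix.
        return True
    fixed = FIRST_FIXED.get(major)
    # Series with no listed fixed release are treated as unpatched.
    return fixed is not None and parsed >= fixed
```

For example, has_fix("v14.2.19") is False while has_fix("14.2.20") is True.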
Fix details
Patched monitors now properly require that clients securely reclaim their global_id when the auth_allow_insecure_global_id_reclaim option is false. Initially, by default, this option is set to true so that existing clients can continue to function without disruption until all clients have been upgraded. When this option is set to false, an unpatched client will not be able to reconnect to the cluster after an intermittent network disruption breaks its connection to a monitor, nor will it be able to renew its authentication ticket when it times out (by default, after 72 hours).
Patched monitors raise the AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED health alert if auth_allow_insecure_global_id_reclaim is enabled. This health alert can be muted with:
ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w
Although it is not recommended, the alert can also be disabled with:
ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
Patched monitors can disconnect new clients right after they have authenticated (forcing them to reconnect and reclaim) in order to determine whether they securely reclaim global_ids. This allows the cluster and users to discover quickly whether clients would be affected by requiring secure global_id reclaim: most clients will report an authentication error immediately. This behavior can be disabled by setting auth_expose_insecure_global_id_reclaim to false:
ceph config set mon auth_expose_insecure_global_id_reclaim false
Patched monitors will raise the AUTH_INSECURE_GLOBAL_ID_RECLAIM health alert for any clients or daemons that are not securely reclaiming their global_id. These clients should be upgraded before disabling the auth_allow_insecure_global_id_reclaim option to avoid disrupting client access.
By default (if auth_expose_insecure_global_id_reclaim has not been disabled), clients’ failure to securely reclaim global_id will immediately be exposed and raise this health alert. However, if auth_expose_insecure_global_id_reclaim has been disabled, this alert will not be triggered for a client until it is forced to reconnect to a monitor (e.g., due to a network disruption) or the client renews its authentication ticket (by default, after 72 hours).
The default time-to-live (TTL) for authentication tickets has been increased from 12 hours to 72 hours. Because we previously were not ensuring that a client’s prior ticket was valid when reclaiming their global_id, a client could tolerate a network outage that lasted longer than the ticket TTL and still reclaim its global_id. Once the cluster starts requiring secure global_id reclaim, a client that is disconnected for longer than the TTL may fail to reclaim its global_id, fail to reauthenticate, and be unable to continue communicating with the cluster until it is restarted. The default TTL was increased to minimize the impact of this change on users.
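The effect of the longer TTL on outage tolerance can be sketched as follows. This is a simplified model rather than Ceph's logic: the function name is hypothetical, and it assumes the client's ticket was freshly issued at the moment the connection dropped, which is the most favorable case for the client.

```python
from datetime import timedelta

OLD_DEFAULT_TTL = timedelta(hours=12)  # previous default ticket TTL
NEW_DEFAULT_TTL = timedelta(hours=72)  # default ticket TTL after the fix


def can_reclaim_after_outage(outage: timedelta, ttl: timedelta,
                             secure_reclaim_required: bool) -> bool:
    """Model whether a client can reclaim its global_id after an outage."""
    if not secure_reclaim_required:
        # Without the check, an expired ticket does not prevent reclaim.
        return True
    # With the check, the old ticket must still be within its TTL.
    return outage < ttl
```

Under these assumptions, a 24-hour outage is survivable with the 72-hour TTL but not with the old 12-hour TTL once secure reclaim is required.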
Recommendations
Users should upgrade to a patched version of Ceph at their earliest convenience.
Users should upgrade any unpatched clients at their earliest convenience. By default, these clients can be easily identified by checking the ceph health detail output for the AUTH_INSECURE_GLOBAL_ID_RECLAIM alert.
If all clients cannot be upgraded immediately, the health alerts can be temporarily muted with:
ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w # 1 week
ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w # 1 week
After all clients have been updated and the AUTH_INSECURE_GLOBAL_ID_RECLAIM alert is no longer present, the cluster should be set to prevent insecure global_id reclaim with:
ceph config set mon auth_allow_insecure_global_id_reclaim false
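The order of operations in these recommendations can be summarized as a small decision helper. This is a sketch only: next_step is a hypothetical function, and it assumes the caller supplies the set of health alert codes currently raised by the cluster (including muted ones), for example as read from the ceph health detail output.

```python
ALERT_CLIENTS = "AUTH_INSECURE_GLOBAL_ID_RECLAIM"
ALERT_ALLOWED = "AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED"


def next_step(active_alerts: set) -> str:
    """Suggest the next action based on which alerts are currently raised."""
    if ALERT_CLIENTS in active_alerts:
        # Some clients or daemons still reclaim insecurely: upgrade them first.
        return "upgrade remaining unpatched clients"
    if ALERT_ALLOWED in active_alerts:
        # No insecure clients remain, so insecure reclaim can be turned off.
        return "ceph config set mon auth_allow_insecure_global_id_reclaim false"
    return "no action needed"
```

The helper checks for insecure clients before suggesting the configuration change, since disabling insecure reclaim while unpatched clients remain would disrupt their access.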