Container orchestration is the business of Kubernetes. Its name derives from the Greek word for helmsman, and Kubernetes acts as a pilot that helps you navigate the waters of containerized workloads and services, facilitating both declarative configuration and automation. To achieve its purpose, Kubernetes relies on a helper called etcd.

What is etcd?

Etcd is defined as “a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines.” One of its most notable uses is managing configuration data, state data and metadata for Kubernetes. According to IBM, etcd owes its popularity to the following qualities:

Fully replicated: every node in an etcd cluster has access to the full data store.
Highly available: there is no single point of failure and it gracefully tolerates hardware failures and network partitions.
Reliably consistent: every data “read” returns the latest data “write” across all clusters.
Fast: supports 10,000 writes per second.
Simple: any application, from simple web apps to complex container orchestration engines such as Kubernetes, can read or write data to etcd using standard HTTP/JSON tools.
Secure: etcd supports automatic transport layer security (TLS) and optional secure socket layer (SSL) client certificate authentication.
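To make the "Simple" point concrete, here is a minimal Python sketch of how a client might talk to etcd over HTTP/JSON. It assumes an etcd v3 member whose JSON gateway is reachable at the hypothetical address in ETCD_URL and exposes the /v3/kv/put and /v3/kv/range paths (older releases used different path prefixes). The helper names are illustrative and not part of any etcd client library.

```python
import base64
import json
import urllib.request

# Assumption: a local etcd v3 member with the JSON gateway enabled.
ETCD_URL = "http://127.0.0.1:2379"


def b64(text: str) -> str:
    """The gateway expects keys and values as base64-encoded strings."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def put_request(key: str, value: str) -> dict:
    """Build the JSON body for a write (POST /v3/kv/put)."""
    return {"key": b64(key), "value": b64(value)}


def range_request(key: str) -> dict:
    """Build the JSON body for a single-key read (POST /v3/kv/range)."""
    return {"key": b64(key)}


def post(path: str, body: dict) -> dict:
    """Send one JSON request to the gateway and return the decoded reply."""
    req = urllib.request.Request(
        ETCD_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

Against a running test instance, post("/v3/kv/put", put_request("foo", "bar")) would write a key and post("/v3/kv/range", range_request("foo")) would read it back. Plain HTTP is used here only to keep the sketch short. A production cluster should require TLS.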

We will focus on security.

Why we need to secure etcd

Etcd plays a key role in hosting Kubernetes-related data and may contain sensitive information such as access credentials and private keys associated with digital certificates. These credentials are lucrative targets for malicious actors and should always be protected. Compromised or stolen credentials are a key source of successful data breaches. They can also be used to impersonate legitimate entities or to create rogue certificates that give access to application data and help spread malware.

Traditionally, this sensitive and crucial data was not stored centrally and was owned by established teams responsible for protecting and safeguarding it. In Kubernetes platforms, however, these credentials are stored in etcd. Because this data is both sensitive and critical, it is important to harden the etcd store.

The Center for Internet Security (CIS) has developed benchmarks for hardening and securing Kubernetes. These guidelines suggest using TLS-based authentication to control access to etcd. In the words of CIS: “Its access should be restricted to specifically designated clients and peers only. Authentication to etcd is based on whether the certificate presented was issued by a trusted certificate authority. There is no checking of certificate attributes such as common name or subject alternative name. As such, if any attackers were able to gain access to any certificate issued by the trusted certificate authority, they would be able to gain full access to the etcd database.”

However, TLS authentication alone is not sufficient. Private keys can be compromised, and attackers can then impersonate authorized users to steal data. Hence the need for a defense-in-depth approach: in addition to access control, we must encrypt the data at rest in etcd.
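The TLS-based access restriction described above can be spot-checked by inspecting the flags an etcd process was started with. The following Python sketch is loosely modeled on the kind of checks a hardening benchmark performs and is not a reproduction of the CIS benchmark itself. It assumes flags are written in the --flag=value form, and the function names are hypothetical.

```python
def parse_flags(argv):
    """Turn ['--a=b', '--c'] into {'--a': 'b', '--c': 'true'}."""
    flags = {}
    for arg in argv:
        if arg.startswith("--"):
            name, _, value = arg.partition("=")
            flags[name] = value if value else "true"
    return flags


def audit_etcd_tls(argv):
    """Return a list of findings; an empty list means every check passed."""
    flags = parse_flags(argv)
    findings = []
    # Server and peer certificates, and the CAs that sign them, must be set.
    for required in ("--cert-file", "--key-file", "--trusted-ca-file",
                     "--peer-cert-file", "--peer-key-file",
                     "--peer-trusted-ca-file"):
        if required not in flags:
            findings.append(f"{required} is not set")
    # Clients and peers must present a certificate issued by the trusted CA.
    for must_be_true in ("--client-cert-auth", "--peer-client-cert-auth"):
        if flags.get(must_be_true) != "true":
            findings.append(f"{must_be_true} is not enabled")
    # Self-signed certificates would bypass CA-based authentication.
    for must_not_be_true in ("--auto-tls", "--peer-auto-tls"):
        if flags.get(must_not_be_true) == "true":
            findings.append(f"{must_not_be_true} must not be enabled")
    return findings
```

Because etcd trusts any certificate issued by the configured CA, a clean result from a check like this says nothing about who holds those certificates. That is why the CA used for etcd should be dedicated to it and why encryption at rest is still needed.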

Compliance requirements and encryption

Encrypting data at rest is also a requirement that stems from various security and privacy regulations, such as PCI DSS, HIPAA and GDPR. For example, the HIPAA Security Rule states that ePHI is “rendered unusable, unreadable or indecipherable to unauthorized individuals” if it has been encrypted by “the use of an algorithmic process to transform data into a form in which there is a low probability of assigning meaning without use of a confidential process or key and such confidential process or key that might enable decryption has not been breached. To avoid a breach of the confidential process or key, these decryption tools should be stored on a device or at a location separate from the data they are used to encrypt or decrypt.”

StackRox notes that “by default, data stored in etcd is not encrypted at rest in the OpenShift Container Platform.” They go on to suggest that “etcd encryption can be enabled in the cluster to effectively provide an additional layer of data security.” According to the Red Hat OpenShift documentation, when data at rest is encrypted in etcd, the following server resources are encrypted:

Secrets
ConfigMaps
Routes
OAuth access tokens
OAuth authorize tokens
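OpenShift manages etcd encryption through its own cluster configuration. As an illustration of what encryption at rest looks like in upstream Kubernetes, the Python sketch below builds an EncryptionConfiguration document for the API server, using the aescbc provider with a freshly generated 32-byte key. The function name and the default key name are hypothetical, and the resource names passed in are only examples.

```python
import base64
import json
import secrets


def new_encryption_config(resources, key_name="key1"):
    """Build an EncryptionConfiguration with a new random AES-CBC key."""
    # The aescbc provider expects a base64-encoded 32-byte key.
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [
            {
                "resources": list(resources),
                "providers": [
                    # The first provider is used to encrypt new writes.
                    {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                    # identity lets the API server still read data that was
                    # stored unencrypted before encryption was enabled.
                    {"identity": {}},
                ],
            }
        ],
    }
```

Calling json.dumps(new_encryption_config(["secrets", "configmaps"]), indent=2) prints the document for inspection. Note that a key written into a local file sits close to the data it protects, so this sketch is suitable for experimentation only.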

Protect encryption keys for robust security

Encrypting etcd depends on cryptographic keys. The data at rest is typically encrypted with symmetric keys (OpenShift, for example, supports AES-CBC), while the TLS certificates that control access to etcd rely on public/private key pairs. Public keys can be shared freely, but private and secret keys must be stored securely so that they cannot be stolen or compromised. For this reason, the use of hardware security modules (HSMs) is highly recommended. NIST, CISA and other organizations stress the need for cloud-based HSMs to securely store and protect private keys.

At the same time, organizations must establish and enforce policies and best practices for managing the lifecycle of encryption keys. When etcd encryption is enabled, encryption keys are created. These keys need to be rotated frequently, and key custodians need visibility into the keys and certificates used to safeguard their containers. With the number of keys skyrocketing, managing keys and certificates manually is a recipe for disaster. The best approach is to automate these processes and to provide early warning notifications of imminent expirations or cryptographic issues, such as key compromise.
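The early warning idea can be sketched in a few lines of Python. The example assumes a hypothetical inventory that maps a key or certificate name to its expiry time. In practice this inventory would come from whatever system tracks the cluster's certificates and keys, and the 30-day threshold is an arbitrary default.

```python
from datetime import datetime, timedelta, timezone


def expiring_soon(inventory, now, warn_days=30):
    """Return (name, days_left) for entries expiring within warn_days.

    Entries that have already expired are included with a negative
    days_left, and the result is sorted with the most urgent entry first.
    """
    cutoff = now + timedelta(days=warn_days)
    due = [
        (name, (not_after - now).days)
        for name, not_after in inventory.items()
        if not_after <= cutoff
    ]
    return sorted(due, key=lambda item: item[1])


# Hypothetical inventory used only for illustration.
EXAMPLE_INVENTORY = {
    "etcd-server": datetime(2024, 1, 11, tzinfo=timezone.utc),
    "etcd-peer": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "etcd-client": datetime(2023, 12, 25, tzinfo=timezone.utc),
}
```

A scheduled job could call expiring_soon with datetime.now(timezone.utc) and notify the key custodians whenever the returned list is not empty.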

Sources

Etcd
Etcd, IBM
Securing Kubernetes, CIS Benchmarks
OpenShift security best practices for K8s cluster design, StackRox
Encrypting etcd data, Red Hat OpenShift