Cloud KMS & Secret Management

One-line summary: Deep dive into Cloud KMS key management, Secret Manager, envelope encryption, and how to design secure key and secret management.

Prerequisites: IAM Evaluation Model, Basic cryptography concepts (encryption, keys, secrets).


Mental Model

Key Management Architecture

flowchart TB
    App[Application] --> SM[Secret Manager]
    App --> KMS[Cloud KMS]
    SM --> KMSKey[KMS Key<br/>Encryption]
    KMS --> KMSKey
    KMSKey --> HSM[Hardware Security Module<br/>HSM]
    App --> DataKey[Data Encryption Key<br/>DEK]
    DataKey --> KMSKey
    style SM fill:#99ccff
    style KMS fill:#ffcc99
    style HSM fill:#99ff99

Key insight: Cloud KMS manages encryption keys, while Secret Manager stores secret values that are encrypted at rest (with Google-managed keys by default, or with your own Cloud KMS keys). Understanding envelope encryption and key rotation is critical for security.

Encryption Model

Envelope Encryption: Encrypt the data with a data encryption key (DEK), then encrypt the DEK with a key encryption key (KEK).

Components:
- KEK: Key Encryption Key (managed by KMS; never leaves the service)
- DEK: Data Encryption Key (used locally to encrypt data)
- Ciphertext: Encrypted data, stored together with the encrypted DEK

Benefits:
- Performance: Data is encrypted and decrypted locally with the DEK (fast)
- Security: The DEK is only ever stored encrypted under the KEK
- Rotation: The KEK can be rotated without re-encrypting the data


Internals & Architecture

Cloud KMS

Key Hierarchy

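Key Ring: A container that groups keys in one location (for example, us-central1). The location is fixed when the key ring is created.

Key: A named cryptographic key (symmetric or asymmetric).

Key Version: The key material at a point in time; rotation adds a new version.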

Structure:

Project
  └── Key Ring (us-central1)
      └── Key (my-key)
          ├── Version 1 (primary)
          ├── Version 2
          └── Version 3
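The hierarchy above maps directly onto the resource paths used to address keys. The following is a minimal sketch of how those paths are assembled; the `KeyVersionRef` class and the `my-project` and `my-ring` identifiers are hypothetical names chosen for illustration, while `us-central1` and `my-key` come from the structure shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyVersionRef:
    """Identifies one key version in the Project > Key Ring > Key > Version hierarchy."""
    project: str
    location: str
    key_ring: str
    key: str
    version: int

    def key_name(self) -> str:
        # Resource path of the key itself (no version component).
        return (
            f"projects/{self.project}/locations/{self.location}"
            f"/keyRings/{self.key_ring}/cryptoKeys/{self.key}"
        )

    def version_name(self) -> str:
        # A key version path extends the key path by one more level.
        return f"{self.key_name()}/cryptoKeyVersions/{self.version}"


# Hypothetical identifiers, for illustration only.
ref = KeyVersionRef("my-project", "us-central1", "my-ring", "my-key", 1)
print(ref.key_name())
print(ref.version_name())
```

Encrypt requests are usually addressed to the key path (so the primary version is used), while a version path pins one specific version.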

Key Types

Symmetric Keys: Same key for encryption and decryption.
- Use case: Encrypt data, encrypt DEKs
- Algorithm: AES-256
- Performance: Fast encryption/decryption

Asymmetric Keys: A public/private key pair, with a different key for each direction.
- Use case: Digital signatures, asymmetric encryption
- Algorithm: RSA or elliptic curve (EC)
- Performance: Slower than symmetric

Key Versions

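Versions: A key can have multiple versions, each with its own key material.

Primary version: The version used for new encryption requests. Older enabled versions remain usable for decrypting data they encrypted.

Use cases:
- Rotation: Rotate keys without downtime
- Rollback: Make a previous version primary again if needed
- Audit: Track key usage per version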
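To make the role of the primary version concrete, here is a small in-memory model. It is a sketch under stated assumptions, not a KMS client: `ToyKey` and its methods are hypothetical names, and the only behavior modeled is which version gets selected for encryption versus decryption.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyKey:
    """In-memory stand-in for a key with several versions (hypothetical, not a KMS API)."""
    versions: Dict[int, bytes] = field(default_factory=dict)
    primary: int = 0

    def add_version(self, make_primary: bool = True) -> int:
        version_id = len(self.versions) + 1
        self.versions[version_id] = secrets.token_bytes(32)  # fresh key material
        if make_primary:
            self.primary = version_id
        return version_id

    def version_for_encrypt(self) -> int:
        # New encryptions always use the primary version.
        return self.primary

    def version_for_decrypt(self, ciphertext_version: int) -> int:
        # Decryption uses whichever version produced the ciphertext,
        # so older versions must stay available after rotation.
        if ciphertext_version not in self.versions:
            raise KeyError(f"version {ciphertext_version} is not available")
        return ciphertext_version


key = ToyKey()
v1 = key.add_version()
old_ciphertext_version = key.version_for_encrypt()  # data written before rotation

v2 = key.add_version()  # rotation: version 2 becomes primary
print(key.version_for_encrypt())                            # new writes use version 2
print(key.version_for_decrypt(old_ciphertext_version))      # old data still uses version 1
```

Rollback in this model is just `key.primary = v1`: new writes go back to the earlier version while data written under version 2 stays readable.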

Hardware Security Module (HSM)

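HSM: A hardware security module is a tamper-resistant device that stores keys and performs cryptographic operations inside its own boundary.

Protection levels:
- Software (default): Key material is protected in software, with no dedicated hardware
- HSM (Cloud HSM): Key material is generated and used inside hardware security modules

Benefits of the HSM level:
- Security: Key material never leaves the HSM in plaintext
- Compliance: Helps meet requirements that mandate hardware-backed keys
- Isolation: Cryptographic operations run inside the hardware boundary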

Secret Manager

Secret Storage

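Secrets: Sensitive data (passwords, API keys, certificates).

Storage: Encrypted at rest, by default with Google-managed keys and optionally with customer-managed KMS keys.

Access: Controlled through IAM policies.

Versioning: A secret can have multiple versions; a new value is added as a new version.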

Secret Access

Access patterns:
- Application: Applications read secrets through the Secret Manager API
- IAM: IAM policies control who can access each secret
- Audit: Access is recorded in audit logs (confirm that data access logging is enabled)

Best practices:
- Least privilege: Grant minimum necessary access
- Rotation: Rotate secrets regularly
- Monitoring: Monitor secret access
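The interplay of versioning, access control, and auditing can be sketched with a small in-memory model. This is an illustration only: `ToySecretStore`, its methods, and the `db-password` and `serviceAccount:app` identifiers are hypothetical, and a per-secret allow list stands in for real IAM policy evaluation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ToySecretStore:
    """In-memory model of versioned secrets with per-secret access lists (hypothetical)."""
    _versions: Dict[str, List[bytes]] = field(default_factory=dict)
    _accessors: Dict[str, Set[str]] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_version(self, secret_id: str, payload: bytes) -> int:
        # Versions are append-only: a new value never overwrites an old one.
        self._versions.setdefault(secret_id, []).append(payload)
        return len(self._versions[secret_id])

    def grant(self, secret_id: str, principal: str) -> None:
        # Least privilege: access is granted per secret, not store-wide.
        self._accessors.setdefault(secret_id, set()).add(principal)

    def access(self, principal: str, secret_id: str, version: str = "latest") -> bytes:
        allowed = principal in self._accessors.get(secret_id, set())
        # Record every attempt, including denied ones.
        self.audit_log.append((principal, secret_id, "granted" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{principal} may not access {secret_id}")
        versions = self._versions[secret_id]
        index = len(versions) if version == "latest" else int(version)
        if not 1 <= index <= len(versions):
            raise KeyError(f"{secret_id} has no version {version}")
        return versions[index - 1]


store = ToySecretStore()
store.add_version("db-password", b"old-value")
store.add_version("db-password", b"new-value")  # rotation adds a version
store.grant("db-password", "serviceAccount:app")
print(store.access("serviceAccount:app", "db-password"))
```

Reading `latest` picks up a rotated value automatically, while pinning a version number gives reproducible deployments at the cost of an explicit update on each rotation.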

Envelope Encryption

Process

Encryption:
1. Generate DEK: Generate a data encryption key
2. Encrypt data: Encrypt the data with the DEK
3. Encrypt DEK: Encrypt the DEK with the KEK (KMS)
4. Store: Store the encrypted data together with the encrypted DEK

Decryption:
1. Retrieve: Retrieve the encrypted data and the encrypted DEK
2. Decrypt DEK: Decrypt the DEK with the KEK (KMS)
3. Decrypt data: Decrypt the data with the DEK
4. Return: Return the decrypted data
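The two procedures above can be sketched end to end. To keep the example runnable on the standard library alone, a toy HMAC-SHA256 keystream cipher stands in for a real authenticated cipher such as AES-256-GCM, and a local `kek` byte string stands in for the KMS encrypt and decrypt calls. Both are assumptions made for illustration; the toy cipher provides no integrity protection and must not be used for real data.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 keystream. NOT production crypto."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce per message, stored as a prefix
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


def envelope_encrypt(kek: bytes, plaintext: bytes) -> dict:
    dek = secrets.token_bytes(32)             # 1. generate a DEK
    ciphertext = toy_encrypt(dek, plaintext)  # 2. encrypt the data with the DEK
    wrapped_dek = toy_encrypt(kek, dek)       # 3. encrypt the DEK with the KEK (a KMS call in practice)
    return {"ciphertext": ciphertext, "wrapped_dek": wrapped_dek}  # 4. store both together


def envelope_decrypt(kek: bytes, record: dict) -> bytes:
    dek = toy_decrypt(kek, record["wrapped_dek"])  # decrypt the DEK with the KEK
    return toy_decrypt(dek, record["ciphertext"])  # decrypt the data with the DEK


kek = secrets.token_bytes(32)  # stand-in: a real KEK stays inside KMS
record = envelope_encrypt(kek, b"account=12345; balance=100")
print(envelope_decrypt(kek, record))
```

Only the 32-byte DEK crosses the KEK boundary, regardless of how large the payload is, which is where the performance benefit comes from.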

Benefits

Performance: Encrypt/decrypt data with DEK (fast, local).

Security: Protect DEK with KEK (secure, KMS).

Rotation: Rotate KEK without re-encrypting data (only re-encrypt DEK).
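The rotation benefit is easiest to see in code: after the KEK rotates, only the small wrapped DEK is rewritten and the bulk ciphertext is left untouched. The sketch below uses a hypothetical in-memory `ToyKms` class and a toy XOR cipher in place of real KMS calls, purely so it runs on the standard library; none of these names are a real API.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def _toy_wrap(kek: bytes, nonce: bytes, dek: bytes) -> bytes:
    # Toy cipher: XOR the 32-byte DEK with one HMAC-SHA256 block. NOT production crypto.
    pad = hmac.new(kek, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(dek, pad))


class ToyKms:
    """In-memory stand-in for a KEK with versions (hypothetical, not a real client)."""

    def __init__(self) -> None:
        self._versions: Dict[int, bytes] = {}
        self.primary = 0
        self.rotate()  # start with version 1 as primary

    def rotate(self) -> int:
        self.primary = len(self._versions) + 1
        self._versions[self.primary] = secrets.token_bytes(32)
        return self.primary

    def wrap(self, dek: bytes) -> Tuple[int, bytes]:
        nonce = secrets.token_bytes(16)
        wrapped = nonce + _toy_wrap(self._versions[self.primary], nonce, dek)
        return self.primary, wrapped

    def unwrap(self, version: int, wrapped: bytes) -> bytes:
        nonce, body = wrapped[:16], wrapped[16:]
        return _toy_wrap(self._versions[version], nonce, body)


def rewrap(kms: ToyKms, record: dict) -> dict:
    """Re-encrypt only the DEK under the current primary KEK version."""
    dek = kms.unwrap(record["kek_version"], record["wrapped_dek"])
    version, wrapped = kms.wrap(dek)
    return {**record, "kek_version": version, "wrapped_dek": wrapped}


kms = ToyKms()
dek = secrets.token_bytes(32)
version, wrapped = kms.wrap(dek)
record = {
    "ciphertext": b"<bulk data encrypted with the DEK>",  # placeholder payload
    "kek_version": version,
    "wrapped_dek": wrapped,
}

kms.rotate()
rotated = rewrap(kms, record)
print(rotated["kek_version"], rotated["ciphertext"] == record["ciphertext"])
```

The cost of rewrapping is proportional to the number of DEKs, not to the volume of data they protect.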

Key Rotation

Automatic Rotation

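Automatic rotation: KMS creates a new key version on a schedule and makes it the primary. Automatic rotation applies to symmetric encryption keys; asymmetric keys are rotated manually.

Configuration:
- Rotation period: How often to rotate (e.g., 90 days)
- Next rotation time: When the next rotation occurs

Process:
1. Generate: KMS generates a new key version
2. Promote: The new version becomes the primary
3. Retain: Older versions stay enabled so existing ciphertext can still be decrypted
4. Retire: Disabling and destroying old versions is a separate, manual step; KMS does not do it automatically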
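The scheduling arithmetic behind automatic rotation is simple enough to sketch. The model below is an assumption-laden illustration, not KMS behavior verbatim: `RotatingKey` and `tick` are hypothetical names, the field names `rotation_period` and `next_rotation_time` are chosen to echo the two configuration settings above, and old versions are simply left in place because retiring them is treated as a separate step.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class RotatingKey:
    """Toy model of a key with a rotation schedule (hypothetical, not a KMS client)."""
    rotation_period: timedelta
    next_rotation_time: datetime
    versions: List[int] = field(default_factory=lambda: [1])
    primary: int = 1

    def tick(self, now: datetime) -> bool:
        """Rotate once if a rotation is due; return True if a rotation happened."""
        if now < self.next_rotation_time:
            return False
        new_version = self.versions[-1] + 1
        self.versions.append(new_version)   # generate a new version
        self.primary = new_version          # promote it to primary
        self.next_rotation_time += self.rotation_period  # schedule the next rotation
        return True


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
key = RotatingKey(
    rotation_period=timedelta(days=90),
    next_rotation_time=start + timedelta(days=90),
)

early = key.tick(start + timedelta(days=89))   # not yet due
on_time = key.tick(start + timedelta(days=90))  # due: version 2 becomes primary
print(early, on_time, key.primary, key.versions)
```

Advancing the schedule from the previous due time rather than from `now` keeps rotations on a fixed cadence even if a check runs late.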

Manual Rotation

Manual rotation: An operator creates a new key version and promotes it explicitly.

Process:
1. Create: Create a new key version
2. Promote: Promote the new version to primary
3. Re-encrypt: Re-encrypt data with the new version (if needed)
4. Deprecate: Disable old versions
5. Destroy: Destroy old versions once nothing depends on them
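The deprecate and destroy steps are easier to reason about as a small state machine. The sketch below is modeled on the usual key version lifecycle (enabled, disabled, scheduled for destruction, destroyed); the `VersionState` enum, the `ALLOWED` table, and the `transition` helper are hypothetical names, and the exact set of permitted transitions should be checked against the KMS documentation.

```python
from enum import Enum


class VersionState(Enum):
    ENABLED = "ENABLED"
    DISABLED = "DISABLED"
    DESTROY_SCHEDULED = "DESTROY_SCHEDULED"
    DESTROYED = "DESTROYED"


# Assumed transition table for this sketch: a scheduled destruction can still be
# cancelled (the version comes back disabled), but DESTROYED is terminal.
ALLOWED = {
    VersionState.ENABLED: {VersionState.DISABLED, VersionState.DESTROY_SCHEDULED},
    VersionState.DISABLED: {VersionState.ENABLED, VersionState.DESTROY_SCHEDULED},
    VersionState.DESTROY_SCHEDULED: {VersionState.DISABLED, VersionState.DESTROYED},
    VersionState.DESTROYED: set(),
}


def transition(current: VersionState, target: VersionState) -> VersionState:
    """Return the new state, or raise if the lifecycle does not permit the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


# Retiring an old version after a manual rotation: disable first, destroy later.
state = VersionState.ENABLED
state = transition(state, VersionState.DISABLED)           # deprecate
state = transition(state, VersionState.DESTROY_SCHEDULED)  # start the grace period
print(state.value)
```

Disabling before destroying gives a reversible checkpoint: if something still depends on the old version, decryption failures show up while the key material can still be re-enabled.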

Performance Characteristics

Latency

KMS operations:
- Encrypt: P95 < 100ms
- Decrypt: P95 < 100ms
- Sign: P95 < 100ms
- Verify: P95 < 100ms

Secret Manager:
- Access: P95 < 100ms
- Factors: Network latency, KMS latency
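To check measured latencies against a P95 figure like the ones above, a nearest-rank percentile is usually enough. The sketch below is one way to compute it; the `percentile` and `meets_p95_target` helpers and the sample values are hypothetical, and the 100 ms default simply reuses the figure quoted above.

```python
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[rank - 1]


def meets_p95_target(samples: Sequence[float], target_ms: float = 100.0) -> bool:
    return percentile(samples, 95) < target_ms


# Hypothetical latency samples in milliseconds, with one slow outlier.
few = [22, 25, 31, 28, 40, 35, 27, 30, 26, 180]
many = few + [24, 29, 33, 27, 31, 26, 38, 30, 25, 28]

print(percentile(few, 95), meets_p95_target(few))
print(percentile(many, 95), meets_p95_target(many))
```

With only ten samples a single outlier is the P95, so small sample windows make percentile alerts noisy; the same outlier disappears from the P95 once twenty samples are in the window.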

Throughput

KMS throughput:
- Operations: Thousands of operations per second
- Scaling: Scales with key usage, within the project's request quotas

Secret Manager throughput:
- Access: Thousands of accesses per second
- Scaling: Scales with usage, within the project's request quotas


Failure Modes & Blast Radius

KMS Failures

Scenario 1: Service Outage

Scenario 2: Key Unavailable

Scenario 3: Key Compromise

Secret Manager Failures

Scenario 1: Secret Unavailable

Scenario 2: Secret Leakage

Performance Failures

Scenario 1: High Latency

Overload Scenarios

10× Normal Load

100× Normal Load


Observability Contract

Metrics to Track

KMS Metrics

Secret Manager Metrics

Logs

KMS logs:
- Key operations (encrypt, decrypt, sign, verify)
- Key access
- Admin activity logs
- Error logs

Secret Manager logs:
- Secret access
- Admin activity logs
- Error logs

Alerts

Critical alerts:
- Service unavailable
- Key unavailable
- Secret unavailable
- High error rate (> 1%)

Warning alerts:
- High latency
- Unusual access patterns
- Key rotation due
- Secret rotation due
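The two numeric conditions in these lists (error rate and latency) can be sketched as a small classifier. The `classify` function is a hypothetical helper: the 1% error threshold comes from the critical alert above, while the 100 ms latency threshold is an assumption for illustration and should be tuned per service.

```python
def classify(errors: int, total: int, p95_ms: float, latency_warn_ms: float = 100.0) -> str:
    """Map request counts and P95 latency to an alert level: critical, warning, or ok."""
    if total <= 0:
        raise ValueError("total must be positive")
    if errors * 100 > total:      # error rate strictly above 1%, in integer arithmetic
        return "critical"
    if p95_ms > latency_warn_ms:  # assumed warning threshold, not taken from the source
        return "warning"
    return "ok"


print(classify(errors=2, total=100, p95_ms=50.0))
print(classify(errors=1, total=100, p95_ms=50.0))
print(classify(errors=0, total=100, p95_ms=150.0))
```

Comparing `errors * 100 > total` avoids floating-point edge cases at exactly 1%, which by the rule above does not fire the critical alert.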


Change Safety

Key Changes

Creating Keys

Rotating Keys

Destroying Keys

Secret Changes

Creating Secrets

Updating Secrets

Rotating Secrets


Security Boundaries

Access Control

Encryption

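At rest:
- KMS keys: Key material is protected by KMS; keys at the HSM protection level are held in hardware security modules
- Secrets: Encrypted at rest, optionally with your own KMS keys

In transit:
- TLS: All API connections use TLS, so requests and responses are encrypted in transit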

Key Protection


Tradeoffs

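Key Storage: Software vs Hardware (HSM) Protection

Software protection:
- Pros: Lower cost, easier to use
- Cons: No hardware boundary around key material; may not satisfy hardware-backed compliance requirements

Hardware (HSM) protection:
- Pros: Key material stays inside the HSM; supports stricter compliance requirements
- Cons: Higher cost, more complex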

Key Rotation: Automatic vs Manual

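Automatic:
- Pros: No manual intervention; consistent schedule
- Cons: Less control over timing; a rotation can surface problems (for example, a consumer pinned to an old version) when nobody is watching

Manual:
- Pros: Full control over timing; can be planned around releases
- Cons: Manual work; easy to forget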

Envelope Encryption: Performance vs Security

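Envelope encryption:
- Pros: Bulk data is encrypted locally with the DEK, so only the small DEK is sent to KMS; the KEK never leaves KMS
- Cons: More complex; the application must generate, wrap, and store DEKs

Direct encryption (sending the data itself to KMS):
- Pros: Simpler; no DEK handling
- Cons: Slower for large or frequent payloads, because every operation is a network call and all plaintext must be sent to the service; request payload size is limited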


Operational Considerations

Capacity Planning

KMS:
- Keys: Plan for number of keys
- Operations: Plan for operation rate
- Scaling: Plan for scaling

Secret Manager:
- Secrets: Plan for number of secrets
- Access: Plan for access rate
- Scaling: Plan for scaling

Monitoring & Debugging

Monitor:
- Key/secret usage
- Operation latency
- Error rates
- Access patterns

Debug issues:
1. Check KMS/Secret Manager health
2. Check key/secret access
3. Check IAM policies
4. Check error logs
5. Review access logs

Incident Response

Common incidents:
- Key unavailable
- Secret unavailable
- High latency
- Security alerts

Response:
1. Check service health
2. Check key/secret access
3. Check IAM policies
4. Rotate keys/secrets if compromised
5. Contact support if persistent


What Staff Engineers Ask in Reviews

Design Questions

Security Questions

Operational Questions


Further Reading

Comprehensive Guide: Further Reading: KMS & Secrets

Quick Links: - Cloud KMS Documentation - Secret Manager Documentation - Envelope Encryption - Key Rotation


Exercises

  1. Design key management: Design a key management strategy for a multi-tenant application. What keys? How is rotation handled?

  2. Handle key compromise: A key is compromised. How do you respond? What's the recovery strategy?

  3. Optimize performance: Your application has high KMS latency. How do you optimize it? What's the strategy?
