Episode 40 — Justify Cryptography Choices by Data Sensitivity and Risk
Encryption at rest only earns its keep when the keys are stored securely, separated from access paths, and managed with envelope patterns that compartmentalize risk. Envelope encryption encrypts data with a data encryption key, which is itself wrapped by a key encryption key stored in a hardened service such as a Hardware Security Module (H S M) or a managed key vault with strict role separation. Access to data should not automatically grant access to keys; operators who can read storage should not be able to unwrap the key material that protects it. Document how keys are generated, where they live, which services can request decrypt operations, and which audit logs prove those requests were authorized. When disks are decommissioned or snapshots copied, the design should allow you to cryptographically render data unrecoverable without trusting a single physical destruction process to go perfectly every time.
Protecting data in motion starts with the protocol and ends with who is authenticated to talk to whom. Transport Layer Security (T L S) with modern cipher suites, properly validated certificates, and Perfect Forward Secrecy should be mandatory for external and internal communications alike. Where mutual trust is required—service-to-service calls within sensitive zones, administrative interfaces, machine-to-machine control paths—mutual authentication via client certificates or strong token-based mechanisms is appropriate, with revocation and rotation that match the risk window. Avoid weak protocol versions and negotiate away legacy ciphers; pin where feasible to narrow exposure to misissued certificates. The proof point is simple: can we show, with captures and configuration proofs, that all high-value paths are encrypted end to end, both peers are authenticated to the required assurance level, and session compromise does not reveal past traffic.
Cryptographic agility is your insurance against the future. Version formats and protocols so fields can carry “alg: v2” without breaking consumers, and keep parsers liberal in what they accept within policy, while emitters are strict in what they produce. Maintain deprecation playbooks that name the suites to retire, the telemetry to confirm usage, the steps to migrate consumers, and the rollback plan if an unforeseen dependency appears. Track external deprecations—browser policies, operating system changes, library end-of-life—and tie them to internal deadlines so you are never surprised by a certificate type or cipher suddenly rejected in production. The question to answer is not whether change will be needed, but whether you can change without outage or security debt when the moment arrives.
Validation closes the loop between policy and reality. Unit and integration tests should prove that encryption, decryption, and signature verification behave as expected across known-good and known-bad cases; proofs of possession should confirm that entities claiming a private key can perform required operations; and configuration checks should compare live settings to your standards—protocol versions, cipher lists, minimum key sizes, certificate chains—on a schedule. For critical paths, run adversarial tests: reject self-signed impostors, detect downgrade attempts, and verify that revocation or short-lived expiry prevents reuse. Store results as evidence alongside change tickets and architecture decisions so audits and future engineers can understand not only what you intended but what you actually proved.
Monitoring must treat cryptographic failures with urgency because they often hide behind “it worked yesterday” symptoms. Alert on expiring or expired certificates before customers discover them; on the appearance of weak ciphers or protocol versions at edges where policies should prevent them; on key reuse outside approved contexts; and on decryption or signature verification failures that exceed calibrated noise. Watch secrets managers for anomalous access patterns and key vaults for privileged operations at odd hours. Tie alerts to runbooks that name the temporary safe state—deny by default, rotate the key, switch to a backup certificate authority—and the approval path for high-impact moves. Crypto mishaps are reliability events with security consequences; handle them with the same production rigor as an outage.