Tag: End-to-end encryption

  • Years ago I built an application called Dashman. Customers used it to show web dashboards on screens around an office, a factory, a classroom, or a public space: Google Analytics, internal Grafana, shop-floor counters, org announcements, whatever they cared about as long as it was on web pages. This post is a retrospective on how that system was put together: the architecture, the handling of rendering jobs, and the end-to-end encryption that applied least privilege to every machine in the system, so that a compromise of the server or the public computers would expose only the data that machine needed to do its job, and nothing else.

    The naive approach, and why it fails

    The naive version of the job is easy to state: log into the websites and display them. One machine with a browser could, in principle, do all of it.

    That machine would hold everything: URLs, cookies, and obviously the rendered pages. In many offices the display sat in a lobby where anyone could walk up to it. In plenty of deployments, all you needed to compromise that computer was a USB keyboard. Plug it in and you were inside a computer that still had the customer’s session cookies in memory or on disk. The attack was simple.

    So the first move is architectural, separating rendering from displaying. Rendering needs cookies and a real browser. Displaying only needs the pixels, the rendered image. The machine under the public screen should not be the machine that logged into Google Analytics.

    The second move is end-to-end encryption, making it mathematically impossible to read the cookies anywhere they weren’t needed. That meant excluding both the Displayer and me as the operator, so my customers didn’t have to trust me with their precious cookies. The encryption was the part that I had the most fun with.

    The architecture

    The hosted side had several components, and the customer side had three applications. Together, the whole stack was a heterogeneous distributed system that looked like this:

    Configurator, Renderer, and Displayer on the customer side; Server, PostgreSQL, S3, and PubNub on the hosted side. Screenshots flow directly between clients and S3; the Server hands out presigned URLs.

    • Configurator: a desktop app the customer ran on an administrator’s machine. They logged into dashboard sources here (cookies originated here), chose what to display and where, and approved Renderers and Displayers.
    • Renderer: ran on a customer-controlled machine inside their network. It loaded configured URLs in a built-in browser and snapshotted them. The customer chose where this ran, such as a machine in a data center.
    • Displayer: ran on machines in public places, connected to the screens. It asked for screenshots at its resolution and showed whatever it got back.
    • Server: a stateless REST API that all the client components talked to.
    • PostgreSQL: durable store for accounts, sites, screenshot metadata, and the render queue.
    • S3: encrypted screenshots at rest.
    • PubNub: push channel. The Server published; Renderers and Displayers subscribed.

    A small deployment could collapse Configurator, Renderer, and Displayer onto one physical machine. Larger ones spread many Displayers across screens, ran several Renderers for capacity and redundancy, and kept a few Configurators on the administrators’ laptops.

    Cookies and two keys

    Everything sensitive had to be end-to-end encrypted. Cookies were the most critical piece of information because with them you could just log into all the websites. So I needed one key, which we’ll call the master key, to encrypt them end-to-end. So far so good, not that hard. With that key in place, the cookies on my hosted database were encrypted blobs that meant nothing to me.

    The screenshots that the Renderers generated also needed to be encrypted so that I couldn’t see them. But I couldn’t use the master key for this task because the Displayers couldn’t ever have it. A compromised Displayer should not expose the key that could decrypt the cookies. That meant a second key, the displayer key.

    This table might help:

    ConfiguratorRendererDisplayerBackend
    master key✅ readable✅ readable🚫 absent🔒 encrypted
    cookies✅ readable✅ readable🚫 absent🔒 encrypted
    URLs✅ readable✅ readable🚫 absent🔒 encrypted
    displayer key✅ readable✅ readable✅ readable🔒 encrypted
    Screenshots✅ readable✅ readable✅ readable🔒 encrypted

    This is just the beginning, though. Keeping these two keys encrypted on the Backend but distributed to Renderers and Displayers is where things get really interesting. More on that after a tour of the render loop.

    The render loop

    Dashman’s day-to-day work was a loop. A Displayer on a wall asked the Server for the best screenshot at its resolution. The Server tried the cache first: it picked among recently successful renders for that tenant, scoring candidates by how closely area and aspect ratio matched the request, by their freshness, and with some randomness.

    The goal there was that if you had 10 sites to display and 10 Displayers, you didn’t want to run 100 jobs when most of those would be pixel identical. But also, you didn’t want the 10 Displayers to show exactly the same thing in sync; variety was valuable.

    If nothing in the freshness window qualified, the Server enqueued a render job, told the Displayer to wait, and woke up the Renderers to start working. PubNub handled the fast path: the Server published to the renderer channel when a job was enqueued and to the displayer channel when a screenshot was ready. Renderers also polled on a slower cadence as a fallback. The database was authoritative and PubNub was a hint about when to look.

    After being woken up by a PubNub notification, all Renderers would try to claim a job. Claiming was an atomic row lock, and only one Renderer won. The order of processing was LIFO, not FIFO, because for displaying screenshots, freshness mattered. That meant a render job could be missed and never processed (under heavy load), but that was acceptable. If a Renderer crashed mid-render, the claim aged out and the row became available again. The claiming Renderer loaded the page, snapshotted it, uploaded the PNG to S3, and reported back. To reduce latency, the Server could auto-claim the next job at that point and assign it when responding to the Renderer. That way rendering jobs were chained together.

    Each site had two knobs the Configurator set: how long to wait after load before snapshotting (charts and fonts need time to settle), and how long a Displayer showed that screenshot before rotating. Both were per-site because a quarterly KPI and a minute-by-minute load report required different amounts of freshness and possibly different amounts of time to render fully.

    After all that, the Server notified the Displayer; the Displayer pulled the file from S3 and showed it on screen.

    Sequence: at the top, the Configurator saves a configuration to the Server, which fans the change out through PubNub to Renderer and Displayer; below, the steady-state loop where the Displayer asks the Server, the Server enqueues and publishes, the Renderer claims and uploads to S3, and the Displayer fetches from S3.

    The sequence diagram above shows two flows: an admin saving a configuration at the top (initial setup and occasional changes), then one full pass of the steady-state render loop below.

    I had sketched an alternative push path using WebSockets and SQS. I shipped PubNub instead because it was less code and less operational complexity while finding product-market fit. At higher scale, migrating would have been worth the savings.

    The render loop assumed each component could decrypt what it received and encrypt what it sent. What follows is the cryptography, in the order a real deployment came up: register user and tenant together, log in, change passwords, approve Renderers and Displayers, render with keys, revoke a decommissioned or compromised machine.

    Cryptography overview

    The user’s password was the root of trust. It never left the Configurator in plaintext. For authentication the Configurator ran SCrypt locally and sent the result to the Server, which ran SCrypt again to store an over-hashed password (a hash of a hash). At login it was similar, first hashing on the Configurator and then over-hashing the password to compare against the database. That split is generally a better way to do authentication than just sending a plain text password, but for Dashman it was a must. You will see why below.

    ConfiguratorRendererDisplayerBackend
    password⏱️ briefly in memory🚫 absent🚫 absent➖ absent
    hashed password⏱️ briefly in memory🚫 absent🚫 absent⏱️ briefly in memory
    over-hashed password⏱️ briefly in memory🚫 absent🚫 absent✅ readable

    During registration the Configurator created the two symmetric keys named above: master and displayer key. They were never stored or sent in plaintext to any hosted server.

    ConfiguratorRendererDisplayerBackend
    master key✅ readable✅ readable🚫 absent🔒 encrypted
    displayer key✅ readable✅ readable✅ readable🔒 encrypted

    To pass those keys through the hosted stack, each user also generated a public/private key pair using elliptic-curve cryptography. The master and displayer keys were encrypted with that public key, and that encrypted payload was stored on the server. Only the matching private key could decrypt them.

    ConfiguratorRendererDisplayerBackend
    user’s private key✅ readable🚫 absent🚫 absent🔒 encrypted
    user’s public key✅ readable🚫 absent🚫 absent✅ readable

    But now we had the problem of storing the user’s private key. We needed to encrypt that key and for that, the Configurator used PBKDF2 to create the password-key, a key derived from the same password the user entered to register or log in. Only someone who knew the password could re-derive that key on a new machine and recover the private key.

    ConfiguratorRendererDisplayerBackend
    password-key⏱️ briefly in memory🚫 absent🚫 absent🚫 absent

    If you are following along, the master/displayer keys get encrypted with the public key of the user, and the private key gets encrypted with the password-key. If you feel there’s one step too many there, you’d be right, until you see how Displayers and Renderers are enrolled. Hold on tight.

    This meant that to successfully log in and start operating the system from the Configurator app all you needed was a single password. But at the API level the Configurator needed two pieces of information: the hashed password (SCrypt, for authentication) and the password-key (PBKDF2, for decryption). Both were derived from the same plaintext password, but only the hashed password ever reached the Server.

    Now picture an attacker with persistent access to the Server: they can dump the database and watch incoming login traffic. The dump only yields the over-hashed password, which can’t be replayed to log in. Watching logins over time, however, lets them collect hashed passwords as users sign in, and that is where most apps would be leaking the plain text password. In Dashman, the hashed password still couldn’t decrypt anything: that needed the password-key, which was derived from the plaintext password and never touched the Server. That is what kept a Server compromise from exposing cookies or URLs.

    Each Renderer and each Displayer also generated their own public/private key pair locally, same as the Configurator. Each also generated a random password, kept on that machine, for authenticating to the Server. Unlike with the Configurator, these private keys never left the machine; the Server received only the public key. The credential and the private key lived and died with the machine: unlike users who could log in from anywhere, identity for Renderers and Displayers was tied to a specific machine and not transferable.

    ConfiguratorRendererDisplayerBackend
    a renderer’s password🚫 absent✅ readable🚫 absent🚫 absent
    a renderer’s private key🚫 absent✅ readable🚫 absent🚫 absent
    a renderer’s public key✅ readable✅ readable🚫 absent✅ readable
    a displayer’s password🚫 absent🚫 absent✅ readable🚫 absent
    a displayer’s private key🚫 absent🚫 absent✅ readable🚫 absent
    a displayer’s public key✅ readable🚫 absent✅ readable✅ readable

    The mechanism by which the master key got to a Renderer was that during enrollment that key was encrypted with the Renderer’s public key and stored on the Server. The Renderer fetched that record and decrypted it with its own private key. That is the payoff: the user’s elliptic-curve key pair exists so the master and displayer keys can be re-wrapped for each new Renderer or Displayer using their public keys, without the user’s password ever touching another machine. The password-key alone couldn’t have done that, because it only exists on the Configurator.

    The master and displayer keys were each put in a key ring: the master key ring and the displayer key ring respectively. That was because whenever a Renderer or Displayer was removed, new keys were generated and then rotated. For some time after rotation, data encrypted with the previous keys still needed to be readable, so all parts of Dashman could decrypt both old and new information.

    With all of this in place, we achieved the goal of cookies and URLs only present in the Configurator and Renderer, and screenshots only readable by the customer but not me, completing the full table of all information:

    ConfiguratorRendererDisplayerBackend
    cookies and URLs✅ readable✅ readable🚫 absent✅ encrypted
    screenshots🚫 absent✅ readable✅ readable✅ encrypted
    password⏱️ briefly in memory🚫 absent🚫 absent🚫 absent
    hashed password⏱️ briefly in memory🚫 absent🚫 absent⏱️ briefly in memory
    over-hashed password⏱️ briefly in memory🚫 absent🚫 absent✅ readable
    master key✅ readable✅ readable🚫 absent🔒 encrypted
    displayer key✅ readable✅ readable✅ readable🔒 encrypted
    user’s private key✅ readable🚫 absent🚫 absent🔒 encrypted
    user’s public key✅ readable🚫 absent🚫 absent✅ readable
    password-key⏱️ briefly in memory🚫 absent🚫 absent🚫 absent
    a renderer’s password🚫 absent✅ readable🚫 absent🚫 absent
    a renderer’s private key🚫 absent✅ readable🚫 absent🚫 absent
    a renderer’s public key✅ readable✅ readable🚫 absent✅ readable
    a displayer’s password🚫 absent🚫 absent✅ readable🚫 absent
    a displayer’s private key🚫 absent🚫 absent✅ readable🚫 absent
    a displayer’s public key✅ readable🚫 absent✅ readable✅ readable

    Registration

    Registration is the first ceremony, the moment when everything the rest of Dashman depends on comes into existence: the authentication material the Server will compare against on every future login, the elliptic-curve key pair the user will carry across machines, the symmetric rings that protect cookies and screenshots, and a single root for all of it (the password). It is the longest of the ceremonies because it has to bootstrap every kind of material at once. After it, the Configurator can fetch and decrypt the user’s data from any machine that knows the password, and the Server is left holding only ciphertext.

    Registration: the user enters a password, the Configurator hashes it and posts it for authentication, the Server creates the account and over-hashes, then the Configurator generates the cryptographic estate locally and uploads only the encrypted artifacts.

    Hashing the password

    The user enters a name, an organization, an email, and a password. The password never leaves the Configurator in plaintext. Before it leaves the machine at all, the Configurator runs it through SCrypt configured with this spec:

    ParameterValue
    AlgorithmSCrypt
    Cost (N)2^14 (16,384)
    Block size (r)8
    Parallelization (p)1
    Output length256 bits
    Salt256-bit random, unique per user

    SCrypt is a memory-hard password hashing function: each guess costs not only CPU time but also a sizeable block of RAM, which makes GPU and ASIC acceleration uneconomical for a brute-force attacker. In 2018 it was the strongest password hashing function available in Bouncy Castle, the cryptography library Dashman shipped on, and the cost parameter is tunable: as hardware gets cheaper, the cost goes up, and because the Configurator stores the spec alongside the result, future generations of the same password are not stuck at today’s setting. The per-user 256-bit salt prevents precomputation across the user base.

    The hash and the spec that produced it travel together in the registration request. The plaintext password does not.

    Storing nothing the Server can replay

    The Server does not store what the Configurator just sent. It runs SCrypt one more time over the incoming hash, with a fresh server-side spec, and stores the result. Call that the over-hash: it is the SCrypt of an SCrypt.

    Hash-then-store on the Server is standard. The unusual half is client-side: by SCrypting before sending, the Configurator keeps the plaintext password off the wire entirely. An attacker watching login traffic only ever sees hashes, not the password itself, and in a system like Dashman, where the plaintext password is what unlocks the rest of the cryptographic material described below, that distinction is the whole game. The client’s spec is stored alongside the over-hash so any Configurator can reproduce the same inner hash from the password later.

    Once the Server has committed the over-hash and created the account and tenant rows, it returns enough about the new user for the Configurator to continue. The Configurator now has an account on the Server, but the Server is not yet holding anything sensitive: no cookies, no rings, not even a public key.

    Generating the cryptographic estate

    Everything that protects sensitive data is built next, and it all happens on the Configurator before anything is sent back.

    First, the Configurator generates an elliptic-curve key pair on secp521r1. The curve choice is deliberate. secp521r1 is the largest of the standard NIST curves, and at the volumes Dashman ever expected (small numbers of agents per tenant, low frequency of cryptographic operations) the runtime cost of the larger curve is invisible. The Server will hold the public key and use it later to wrap material for the user; the private key will be the only thing on the network capable of unwrapping that material, and the Server will never see it without a layer of encryption around it.

    Next, it derives a 256-bit symmetric key from the password using PBKDF2 with this spec:

    ParameterValue
    AlgorithmPBKDF2 with HMAC-SHA-256
    Iterations65,536
    Output length256 bits
    Salt256-bit random, per-derivation

    We called the output the password-key. Both SCrypt (used at registration and login) and PBKDF2 run once per login on the Configurator, and both serve the same underlying purpose: making each guess at the password expensive enough that brute force is impractical. They guard different ciphertexts. SCrypt-derived material is what the Server stores as the over-hash and what an attacker would have to crack to recover the password from a database dump. PBKDF2-derived material is what wraps the elliptic-curve private key, where an attacker who got hold of that wrapped blob would need to brute-force the password through AES-GCM to read it. Within the cross-platform Java stack Dashman ran on, SCrypt (via Bouncy Castle) was the strongest password hashing primitive available, and PBKDF2 with HMAC-SHA-256 (via the JCE’s SecretKeyFactory) was the standard option for deriving a symmetric key from a password.

    The Configurator then encrypts the elliptic-curve private key under the password-key with AES in GCM mode (AES/GCM/NoPadding). The IV, the PBKDF2 salt, and the iteration count travel alongside the ciphertext when it is stored on the Server. The password never does. AES-GCM gives authenticated encryption, so a wrong password produces a GCM tag failure rather than silently decrypting into garbage that might look like a private key.

    Two empty rings are created next, the master ring and the displayer ring, and one random 256-bit AES key is added to each. The keys are pulled directly from the platform CSPRNG: there is no derivation, no input, just bytes. A ring is an ordered list of AES keys where the last entry is the current key (used for fresh encryption) and earlier entries are retained so that older ciphertexts remain decryptable. At registration, each ring has exactly one key; rotations described later append more.

    Finally, each ring is wrapped with the user’s elliptic-curve public key using Bouncy Castle’s ECIESwithAES-CBC profile, with a 128-bit nonce and a 128-bit MAC. ECIES, as a family, glues an ECDH agreement to a symmetric cipher and a MAC; the BC profile chosen was the straightforward bundled option at the time, rather than assembling the construction from primitives by hand. Before encryption, each ring is placed inside a small verifier envelope identifying the user and account it belongs to; on decryption the verifier must match or the operation fails. The verifier is what makes a misrouted blob (say, one user’s ring pasted into another’s row) fail closed instead of silently decrypting into something the application would try to use.

    At this point the Configurator holds in memory: the password (still, very briefly), the password-key, the elliptic-curve key pair, both rings, and the SCrypt hash of the password.

    It sends to the Server, in one PUT request: the public key, the ciphertext of the private key, the ciphertext of each wrapped ring, and the parameters needed to rebuild the password-key on a future login. The Server persists all of that as JSON it cannot decrypt.

    By the time the request returns, the password is dropped from memory. From the Server’s vantage, it is holding pieces of mathematics whose meaning is gated on a password it has never seen and on a private key it cannot reconstruct. From the Configurator’s vantage, it now has everything it needs to manage cookies and sites for this account, and a path to recover those materials on any other machine that knows the password.

    Logging in

    Login does two things at once: it authenticates the user to the Server, and it rebuilds the same in-memory state that registration produced. Those are independent code paths that happen to share a single input (the password) and a single transport (the Configurator’s HTTPS connection). If the first one succeeds and the second one fails, the user is signed in but cannot read anything.

    Login: the Configurator fetches the user's SCrypt spec from the Server, hashes the entered password with it, authenticates with HTTP Basic, then receives the encrypted key pair, the two encrypted rings, and the encrypted cookies and decrypts them locally.

    Authenticating

    Each user has their own SCrypt salt, generated at registration. The Configurator does not yet know it on a fresh install, so it asks the Server: given this email, what spec should I use? The answer is not secret. Without the password, the spec is useless. With the spec, the Configurator can reproduce the same SCrypt hash registration computed, regardless of which machine it runs on.

    The Configurator runs SCrypt with the fetched spec and forms HTTP Basic credentials of the form email:base64(hash). The Server runs SCrypt once more over the incoming hash with the spec that sits next to the over-hash, compares byte for byte, and either accepts or rejects. There is no separate session token; subsequent requests reuse the same Basic credentials until the Configurator clears them.

    If the spec the Configurator received is older than current defaults (say, the cost parameter has been raised since the account was created), the Configurator re-hashes the password with the current spec and uploads the upgraded hash on the next save. Work factors rise over the lifetime of the system without forcing anyone to reset their password.

    Recovering the keys

    Authentication only proves the password matches the stored over-hash. The Configurator still has to decrypt everything that follows. The Server returns, in the response to the authenticated GET, the encrypted elliptic-curve key pair, both encrypted rings, and the tenant’s encrypted cookies. The Configurator then:

    1. Reads the PBKDF2 spec stored alongside the encrypted private key and re-derives the password-key from the entered password. PBKDF2 with the same inputs yields the same key it produced at registration.
    2. Decrypts the elliptic-curve private key under the password-key. AES-GCM verifies that the ciphertext has not been tampered with, and a wrong password produces a tag failure rather than a corrupt private key.
    3. Decrypts each ring with the elliptic-curve private key, using ECIES. The verifier inside each ring’s ciphertext must match the expected user and account, otherwise decryption is rejected even if the algebra would have succeeded.
    4. Decrypts cookies under the master ring. The ring is tried current-key first, then older keys, until one verifies; that is how cookies encrypted before the last rotation remain readable.

    By the time the Configurator’s log in process finishes, its memory looks identical to the state at the end of registration, except that the Configurator did not generate any of these materials, it derived and decrypted them. The password is dropped; the password-key has done its job and is discarded with it.

    A practical corner of this design: a wrong password and a tampered private-key blob both manifest as the same AES-GCM tag failure. In practice this rarely caused confusion because the authentication step ahead of decryption already filtered out the common case (a mistyped password).

    The reason the elliptic-curve private key is wrapped under the password rather than stored only in an OS keychain is that Dashman was designed to recover on a fresh install: type the password, get everything back, no preloaded secrets needed. A keychain-only design would have hurt that experience and would not have changed the security story, since whatever the keychain held would still need an unlock secret tied to the user.

    Changing the password

    A password change in Dashman is, deliberately, the cheapest cryptographic operation the system performs. The hard work is concentrated at registration; from that point on the password unwraps exactly one thing (the elliptic-curve private key) and nothing else.

    Password change: the Configurator hashes the current password to authenticate, hashes the new password with a fresh spec, derives a new password-key, re-wraps the elliptic-curve private key under it, and sends the bundle to the Server.

    The flow only runs while the user is already logged in, which means the Configurator already holds the decrypted elliptic-curve private key and both rings in memory. It does not hold the original password (login dropped it), so the user has to enter the current password again to prove identity to the Server, plus the new password to provide fresh derivation input.

    The Configurator does, in order:

    1. SCrypts the current password with the stored spec to construct authentication credentials, the same way login does. This proves to the Server that whoever is asking for the change still knows the password the account was created with.
    2. SCrypts the new password with a fresh spec (new salt, current defaults). This is the hash the Server will over-hash and store going forward.
    3. Re-derives a new password-key from the new password with PBKDF2 and a fresh spec. The old password-key was never persisted, so there is no need to invalidate it; it just stops being useful once the new one is in place.
    4. Encrypts the elliptic-curve private key, which is sitting in memory in plaintext, under the new password-key with AES-GCM. The public key does not change. The rings do not change.

    The Configurator sends, in a single request: the new hashed password and its spec, the new ciphertext of the elliptic-curve private key (with its new PBKDF2 salt embedded), and any other profile fields the user edited. The Server first verifies the request using the current password’s Basic credentials, then over-hashes the new hashed password, stores it in place of the old one, and replaces the encrypted key-pair blob.

    The master and displayer rings are untouched. Cookies, site configurations, and screenshots already in S3 do not need to be re-encrypted: their keys live in the rings, which are themselves wrapped under the unchanged elliptic-curve public key. Re-encrypting any of that on a password change would be a lot of work for no security benefit, since the password only ever protected one specific layer of the cryptographic onion.

    A password change does not retroactively invalidate any other Configurator that previously decrypted the private key. Cryptographic invalidation of in-flight or cached material is what ring rotation is for, and that runs in a different ceremony, described later.

    Approving a Renderer

    Setting up Dashman means enrolling at least one Renderer and at least one Displayer. The two ceremonies share almost the entire shape, both in the UI the customer touches and in the sequence of network calls underneath; what differs is what each agent ends up holding. The Renderer is the fuller case: it receives both rings (master to decrypt cookies and site URLs, displayer to encrypt screenshots). I’ll walk through it first.

    Renderer approval: the Renderer generates its own key pair and a random password and connects unapproved; the user approves it in the Configurator, which wraps both rings under the Renderer's public key; PubNub notifies the Renderer, which fetches and decrypts the rings and the cookies.

    Connecting without keys

    A Renderer is launched on a machine the customer controls. On first start it has no account, has nothing to render, and does not know anyone’s password. The only input it has is a target: a slug or email the user types in that identifies the tenant it wants to join.

    Locally, before contacting anyone, the Renderer generates two things. First, its own elliptic-curve key pair on secp521r1, the same curve the user picked at registration but generated independently and never derived from the user’s. Second, a random password drawn from the platform CSPRNG, which the user never sees. That password is purely a credential for talking to the Server later; it is not used to derive any encryption material. Unlike the Configurator, where a wrapped copy of the private key is uploaded so the user can log in from any machine, a Renderer’s private key never leaves: both the credential and the private key live and die with this specific machine, and onboarding another Renderer means generating fresh ones on the new machine with no way to transfer the original identity.

    It sends to the Server: the tenant identifier, a display name (typically the machine’s hostname, useful when the user is looking at a pending Renderer in the Configurator and has to recognize it), the public half of the key pair, and the random password.

    The Server hashes the password once with SCrypt and persists a Renderer record marked unapproved. A single hash, rather than the double SCrypt used for users, is appropriate here because the password was generated by a CSPRNG and is not a stretching target for an offline guesser; the hash exists only so the Server is not storing the credential in clear. The Renderer is now visible on the Server, but the only material attached to it is its public key, its name, and authentication state. The Server then publishes a PubNub notification on the tenant’s user channel, which is how the Configurator learns there is a pending Renderer.

    Approving

    In the Configurator, the user sees the pending Renderer appear in the list and clicks Approve. This is the cryptographic step: it is the moment the user, who has the only copy of the unwrapped rings, decides that a specific Renderer’s public key is allowed to decrypt them.

    The Configurator wraps both rings, master and displayer, under the Renderer’s public key with ECIES, using the same ECIESwithAES-CBC profile and the same verifier envelope as everywhere else. The verifier here ties the wrapped material to the Renderer’s identifier and the account identifier, so a copy of one Renderer’s ciphertext cannot be reused by a different Renderer even if both belonged to the same tenant. The Configurator then PUTs the wrapped rings, with the Renderer marked approved, to the Server.

    Setting approved without wrapping the rings would be inert: the Renderer can authenticate either way once it has a password, but it cannot decrypt anything until the encrypted rings exist. Approval and key wrapping are bundled in the same request to keep the two from drifting out of sync.

    A subtle point worth pulling out. The user’s rings were already wrapped, at registration, under the user’s elliptic-curve public key. Approval wraps them a second time, under the Renderer’s elliptic-curve public key. The Server ends up holding two ciphertexts that decrypt to the same plaintext: one for recovery on a fresh Configurator (login on a new laptop), one for use on the Renderer. The Configurator only ever sends material wrapped under public keys; the user’s password never reaches the Renderer machine, and the Renderer’s private key never reaches the Configurator.

    Coming online

    The Server publishes a second PubNub notification, this time on the Renderer’s channel, saying the Renderer is now approved. The Renderer fetches its own record (now with encrypted rings populated), decrypts the rings under its own private key, decrypts the tenant’s cookies under the master ring, and from that point on can render pages.

    Approving a Displayer

    The ceremony for adding a Displayer looks almost identical to the one for a Renderer. The Displayer generates an elliptic-curve key pair and a random password locally, sends a connection request with its public key, the Server hashes the password and creates an unapproved record, the Configurator gets a PubNub notification, the user clicks Approve, the Configurator wraps a ring under the Displayer’s public key, the Server stores it, PubNub tells the Displayer, the Displayer decrypts.

    Displayer approval: same shape as the Renderer ceremony, but only the displayer ring is wrapped and shipped.

    But did you notice the difference?

    The Configurator wraps a ring (singular), not the rings. The master ring is not part of the payload, and on the Displayer side, the accessor that would return the master ring deliberately returns nothing. The Displayer’s reality is: it receives only one ring, it only ever decrypts one kind of payload (screenshots), and there is no path in the codebase by which a master key could end up in its memory.

    That asymmetry is the entire reason the Displayer exists as a separate component. A Renderer is a trusted machine in the customer’s network (a data center, an admin’s desk) that needs cookies in order to do its job. A Displayer is the machine wired up to a screen in a lobby or a hallway. Even if a stranger walked up to the lobby machine with a USB keyboard and dumped everything in memory, all they would get is whatever screenshots had recently been displayed and the key that decrypts a few more from S3. The decryption authority that could log into Google Analytics never touches that machine, and the Configurator has no way to send it there.

    The flip side of that constraint is that the Displayer cannot do anything with a screenshot until it has been encrypted under its ring. Which is what the next section is about.

    Rendering a screenshot

    Up to this point every ceremony has been about provisioning. Once Renderers and Displayers are approved, Dashman spends the rest of its life in a loop where bytes flow through the system and end up on a screen. The architectural separation from earlier and the cryptographic plumbing from the last few sections finally pay off together in this loop.

    Screenshot rendering: a Displayer asks the Server for a screenshot, a render job goes through PubNub to a Renderer which fetches encrypted site config, renders the page, encrypts the screenshot for the displayer ring, uploads it to S3, and the Displayer downloads and decrypts it.

    A Displayer asks the Server for the best screenshot at its current resolution. The Server consults its cache. If the cache cannot satisfy the request, it queues a render job for a fresh screenshot and publishes a notification on the renderer channel. The earlier render-loop section covered the cache and the queue; here the focus is on what is encrypted at each step.

    A Renderer wakes up, claims the newest job (LIFO so that a Displayer waiting on the screen sees fresh pixels first), and gets back from the Server two things: the encrypted site configuration (URL, per-site delay, anything else specific to the site), and a pair of presigned S3 URLs (one for upload, one for download), both valid for a day. The site configuration is encrypted under the current master key; the Renderer decrypts it by trying every key in its master ring until one verifies, in practice the current one.

    With the URL in hand, the Renderer loads it in an embedded WebView with the tenant’s cookies attached. After the page has loaded, the Renderer waits the per-site delay (some sites take seconds for charts and fonts to settle), snapshots the WebView, and serializes the result as PNG bytes.

    It then encrypts those bytes with the current displayer key, using AES-GCM, and attaches a verifier identifying the screenshot and the site the bytes belong to. On the Displayer side, decryption will reject any blob whose verifier does not match the screenshot the Displayer asked for. The resulting ciphertext, along with its IV and the small envelope around it, is uploaded to S3 via the presigned PUT URL. The Server never sees the bytes; the upload goes directly from the Renderer to S3.

    The Renderer then PUTs a small marker back to the Server reporting the job done. The Server flips the screenshot to rendered, publishes a notification on the displayer channel, and, as an opportunistic latency improvement, returns the next claimable job in the same response so the Renderer can chain into it without round-tripping through PubNub.

    A Displayer subscribed to the displayer channel receives the notification, fetches the screenshot metadata (which includes the presigned GET URL), downloads the ciphertext from S3, and decrypts it with its own displayer ring. The verifier check rejects any blob whose screenshot or site identifiers do not match what the Displayer asked for: a misrouted file fails closed instead of decrypting into something unexpected. Decryption succeeds, the Displayer hands the PNG to the screen rendering code, and the public sees pixels.

    Three things land at once at this point. The Renderer briefly saw cookies and a fully rendered web page, but they existed only in memory while it painted. The Displayer never saw cookies or a URL; it received an opaque blob, verified it, decrypted it, and showed it. The Server stored a record of which screenshots existed and pointers to where in S3 they lived, but it stored no plaintext; an operator browsing the production database, or an attacker who dumped it, would find ciphertext indexed by tenant. S3 held the ciphertext but could not read it. The end-to-end story holds.

    Decommissioning a Renderer

    Renderers and Displayers come and go. A data center contract ends, a machine is replaced, a Displayer in a lobby is suspected of having been physically tampered with. From a cryptographic point of view, the question is whether the new state of the system can be reached without invalidating the user’s password. The answer is yes, by rotating rings rather than rotating roots.

    Renderer removal: the Server deletes the Renderer record and notifies it via PubNub; the Configurator generates new master and displayer keys, re-wraps both rings for every remaining agent, re-encrypts site data, and saves the whole thing in one transaction.

    When the user removes a Renderer in the Configurator, the Configurator first asks the Server to delete the Renderer row. The Server deletes it and publishes a notification on the Renderer’s channel. If the Renderer is online, it receives the notification, logs out, and stops; its credentials no longer exist on the Server, so even if it tried to keep polling, the Server would reject it.

    Deletion alone is not enough. The removed Renderer kept whatever plaintext it had already extracted, and it kept its copy of both rings on local disk. If it is offline at the moment of deletion (the machine was stolen, the operator only knows it is missing), it will never receive the notification at all. Anything encrypted with the current master or displayer key from this point onward (future cookies, future site configurations, future screenshots) must be unreadable to the Renderer that just left.

    So the Configurator, immediately after the delete completes, performs the rotation:

    1. Generates a new random 256-bit AES key and appends it to the master ring. The previous master key stays in the ring; it has to, because cookies and site configurations encrypted before this moment are still in the database and still need to decrypt.
    2. Generates another fresh 256-bit AES key and appends it to the displayer ring, for the same reason: screenshots already in S3 are encrypted under the old displayer key and still need to be readable by the Displayers that survived.
    3. Re-wraps both extended rings under the user’s elliptic-curve public key, since the user themselves needs to recover the new state on the next login.
    4. For each Renderer that remains approved, re-wraps both rings under that Renderer’s public key with ECIES, replacing the previous ciphertext on the Server.
    5. For each Displayer that remains approved, re-wraps the displayer ring under that Displayer’s public key, replacing the previous ciphertext.
    6. Re-encrypts every site’s configuration (URL and cookies) under the new master key. Because encryption always uses the last key in the ring, anything saved from now on is under the new key only; the old key stays in the ring solely to decrypt historical artifacts.

    All of that, plus the deletion that prompted it, is sent to the Server as one PUT and committed in one transaction. Either every agent is updated and every site is re-encrypted, or the rotation is aborted; there is no window in which some agents have the new ring and others do not.

    The removed Renderer is now in a peculiar but desirable position. It still has its old copy of both rings, so any old ciphertext it kept around (a screenshot that happened to be in transit, a site configuration it cached before deletion) would still decrypt locally. But it has no credentials to fetch anything new from the Server, and even if it had a backdoor channel, every new screenshot in S3 is encrypted under a key it never received, every site configuration the Server holds is encrypted under a key it never received, and any new cookie the Configurator saves is encrypted under a key it never received. The decryption capability the Renderer kept is bounded by what it already had, not by what the system will produce going forward.

    Removing a Displayer follows the same routine. The Displayer never held the master key, so the master rotation is, strictly speaking, more than the threat requires; the displayer rotation is what matters. The codebase rotates both anyway because it is cheap, the routine is shared with Renderer removal, and rotating both leaves no current key the removed agent ever held in either ring. Simpler to reason about, no real cost at the volumes Dashman handled.

    Verifiers and wire formats

    A note about a detail that has shown up in every ceremony so far without much explanation. Every encrypted payload, symmetric or asymmetric, is wrapped in a small typed envelope before it is encrypted. The envelope holds the payload itself plus a verifier: a small object identifying what the payload is supposed to be. For user blobs the verifier carries the user identifier and the account identifier; for renderer and displayer blobs, the agent and account identifiers; for screenshots, the screenshot identifier and the site identifier. The envelope is serialized to JSON, encrypted as one piece, and on decryption the verifier inside the plaintext must match the verifier the caller expected. If it does not, decryption raises a verification failure rather than returning the bytes.

    This is not a digital signature. It does not prove the payload came from a particular party. What it gives instead is type-level safety inside the ciphertext: a payload meant for one tenant cannot be misrouted to another and silently decrypt into something the application would try to use. Crossing tenants is the kind of bug a system like this should not tolerate as silent corruption; verifier mismatches make it loud.

    For asymmetric wrapping the system uses Bouncy Castle’s ECIESwithAES-CBC with a 128-bit nonce and a 128-bit MAC. The curve choice (secp521r1) was the largest NIST curve with a mature ECIES profile in Bouncy Castle. The bigger curve hedges against future cryptanalysis at a runtime cost invisible at Dashman’s volumes. Assembling ECIES from primitives by hand would have been more work for no real gain at the time.

    For symmetric encryption the system uses AES in GCM mode (AES/GCM/NoPadding) with a per-payload random IV. GCM gives authenticated encryption (a tag that fails closed on tampering), which composes cleanly with the verifier envelope: a wrong key produces a GCM tag failure, a right key over the wrong tenant’s payload produces a verifier mismatch, and both are surfaced as decryption failures with no plaintext returned.

    I would revisit the curve and the specific ECIES profile if I were starting again today; standards drift and library defaults have moved on. The compartmentalization story (which keys live where, who can wrap material for whom, and what each component is mathematically prevented from reading) would stay the same regardless.

    That was fun

    Building Dashman was a lot of fun. Thinking through how it could be attacked, from lobby USB keyboards to a malicious operator rifling through S3, was fun in the same way hard puzzles are. Putting it into production and watching real customers use it was satisfying: the design decisions held up under real configuration mistakes, flaky networks, and Renderers disappearing mid-job, not only under the load tests I ran myself.

    I do wish it had seen more sustained load in the wild. It did get real usage, just not the “millions of screens” scale that would have stress-tested every corner of the queue and cache behavior beyond what I could simulate. Still, for a retrospective on a system I built years ago, that is a good problem to have.