Task 3 — Threat model review against current implementation

Date: 2026-05-11 Reviewer: Cho García Scope: crypto/spec/threat-model.md and the integration mechanisms that implement each claimed protection across @shieldfive/crypto (vendored at 1.0.0-alpha.3, repo HEAD a5e99ac), the web app (web HEAD bfa41a57), the Supabase schema (web/supabase/migrations/), and the B2 direct-upload / share-download server boundaries. Methodology: read the threat model end-to-end without consulting code; build a coverage table for every protected and excluded threat; trace each protected threat to the concrete file:line that enforces it; verify each excluded threat remains excluded under the integration as shipped. Tasks 1 and 2 findings were treated as inputs (Finding 1.1's "file_id is structurally bound via HKDF salt, not AAD content" materially changes how A2's splice-resistance claim should read).

Status note. This document records the review as performed on 2026-05-11. For current finding-level status (Fixed / Open / Accepted) including PR links for shipped fixes, see triage.md. The per-task status indicators in this document reflect state at review time and are not maintained after publication.

Summary

The threat model holds up well at its scoped claims. A1 (honest-but-curious server), A2 (active network adversary), and A3 (future quantum adversary against suite-0x03 files) each have concrete mechanisms in the code that deliver the documented protections, modulo the "spec describes a property; implementation provides it via a slightly different layer" pattern Task 1 already documented. Confidentiality against A1 holds: every key that could decrypt user data is wrapped under material the server never sees. v1 integrity properties (truncation, reorder, cross-file splice) are delivered, with cross-file splice resistance coming from HKDF structural binding rather than from file_id-in-AAD as the threat model prose claims.

The findings cluster around two gaps the threat model doesn't currently flag:

The threat model documents an A3 protection that the deployed system does not yet provide. The PQ-hybrid suite (0x03) the threat model uses to defend against "harvest now, decrypt later" is implemented in the crypto library but is not wired into the web app's worker — the worker only imports the AES-GCM-v1 stream and writes suite 0x01. Every file the web client encrypts today is classical-only. The threat model's A3 paragraph is technically correct (it scopes the claim to suite-0x03 files) but the comparison table and natural reading both imply available-today PQ protection.
The threat model presents the v1 integrity invariants (truncation/reorder/splice) as the integrity story for ShieldFive, but production data is overwhelmingly v0 — the v1 writer flipped on via NEXT_PUBLIC_CRYPTO_V1_ENABLED only days before this review (see web/utils/cryptoV1Flag.ts:15-16). v0 files have none of those AEAD-bound invariants per crypto/spec/format-v0.md § "Limitations of v0".

Highest-impact findings:

3.1 — Threat model A2 says "Splice chunks across files (file_id AAD)" but the implementation provides cross-file splice resistance via HKDF salt = file_id, not via AAD content. Same shape as Task 1 Finding 1.1 in a different document. Medium.
3.2 — A3 protection is documented as available via suite 0x03; the web worker only wires up the AES-GCM-v1 suite. Every file the web client encrypts today is classical-only and subject to harvest-now-decrypt-later against A3. Medium.
3.3 — Threat model presents v1 invariants as ShieldFive's integrity story without acknowledging that the bulk of live data is still v0, which has no chunk AAD, no file_id binding, and no truncation detection. Medium.

No Critical or High findings. No claimed confidentiality property is broken, and no out-of-scope threat is reachable under realistic conditions where the threat model said it wouldn't be.

Threat coverage table

The table below enumerates every threat the spec mentions — protected-against and excluded — and identifies the concrete mechanism (or absence thereof) in the code as shipped. Findings flag where prose and mechanism diverge.

Protected threats

Adversary	Threat claim	Mechanism (file:line)	Status
A1	Cannot recover plaintext file content	All file bytes encrypted client-side before transmission. v0 path: legacy AES-GCM worker; v1 path: `web/utils/sfCryptoWorkerImpl.ts:232-376` (`initEncryptV1` + `processEncryptV1`). Server-side endpoints only proxy ciphertext bytes or issue B2 upload-part URLs — see `web/app/api/files/upload-part/route.ts:243-288`.	Holds
A1	Cannot recover unwrapped content keys	`csk_wrapped` is AES-GCM-wrapped under the parent folder/root key; the server stores `csk_wrapped` + `csk_iv` and the wrapping key is never on the server. Wrap helper: `web/utils/keyCrypto.ts:52-68`. Vault wraps: `web/utils/vaultKeyClient.ts:100-147`.	Holds
A1	Cannot recover plaintext filenames or folder names	Names encrypted client-side via Argon2id-derived AES-GCM, `web/utils/metadataClient.ts:270-313`. Server sees only the v4 ciphertext payload + a name_hash (HMAC-SHA-256 under SHA-256(rk); A1 cannot compute candidates without rk).	Holds (3.6 documents add'l metadata not in spec list)
A2	Cannot recover plaintext (TLS + AEAD)	TLS terminated at the Vercel/Cloudflare edge; AEAD layer is end-to-end client-to-storage. v1 chunks use AES-GCM-256 with 16-byte tag, AAD bound to chunk_idx/total/is_final (Task 1 § "v1 chunks (AES-GCM-v1)").	Holds (3.8 documents TLS boundary)
A2	Cannot inject ciphertext that decrypts to attacker-chosen plaintext	AEAD tag verification on every chunk before plaintext emission (`crypto/src/streams/aes-gcm-v1.ts:340-348`); header MAC verified before any chunk processing (`crypto/src/format/header.ts:201-214` and Task 1 Finding 1.4a/b/c).	Holds
A2	Truncate a file undetectably (final-chunk AAD `is_final` flag)	`is_final` byte in AAD (`crypto/src/format/header.ts:219-236` `buildChunkAad`). Belt-and-braces: `plaintext_size` is HMAC-bound in the header and `bytesEmitted === plaintextSize` is checked at flush (`crypto/src/streams/aes-gcm-v1.ts:399-401`).	Holds for v1 (3.3 covers v0 omission)
A2	Reorder chunks undetectably (chunk-index AAD)	Chunk index `i` encoded as `uint64_be` in AAD, AAD is fixed-length 36 bytes (Task 1 § "v1 chunks"). IV/nonce also includes the chunk index so reorder produces wrong nonce as well — double defense.	Holds
A2	Splice chunks across files (`file_id` AAD)	Cross-file splice resistance is provided structurally: HKDF salt = `file_id` for `chunk_key`, and HKDF IKM = `file_id` for `nonce_prefix`. A spliced chunk decrypts under the splice target's `chunk_key`/`nonce_prefix`, which is wrong, and AEAD fails. The threat model's parenthetical "file_id AAD" is incorrect — file_id is NOT in chunk AAD (Task 1 Finding 1.1, same shape).	Holds; prose wrong — Finding 3.1
A3	Cannot recover plaintext from files encrypted with suite `0x03`	Library implementation exists at `crypto/src/suites/pq-hybrid-v1/index.ts`, `crypto/src/suites/pq-hybrid-v1/api.ts`. ML-KEM-1024 + XChaCha20-Poly1305 hybrid with combined key K derived via HKDF. Not wired into the web worker (`web/utils/sfCryptoWorkerImpl.ts:37` imports `streams/aes-gcm-v1` only).	Library: holds; deployment: not delivered — Finding 3.2
A3	(Implicit) Files encrypted with `0x01`/`0x02` are NOT A3-protected	Documented honestly in threat model § A3. Confirmed: current web writer emits `0x01` only.	Holds

Excluded threats (out of scope)

#	Threat	Stated rationale	Still excluded under integration?
1	Malicious client delivery (backdoored JS bundle)	Crypto can't detect a backdoored bundle — host application responsibility.	Yes. CSP is enforced (not report-only) in production via `web/middleware.ts` and validated at boot (`web/utils/validateEnv.ts:54-57`). CSP is the host's mitigation (out of crypto-layer scope); the crypto layer itself doesn't claim defense.
2	Endpoint compromise (malware / keyloggers)	Plaintext exists on the user's device — crypto cannot help.	Yes. No integration change has moved plaintext off the device (e.g., no server-side key escrow, no plaintext caching in third-party services). Master-key in-memory handling deferred to the separate server-side review.
3	Metadata leakage at storage layer	File sizes, timestamps, folder cardinality visible.	Yes, but the spec's three-item list is incomplete. Per-share `viewer_country` (geo header), HMAC-hashed IP, per-share download counts, `share_password_attempts` lockout state, deterministic per-user `name_hash` (HMAC-SHA-256 under SHA-256(rk)), and aggregate storage usage in `subscriptions.last_active_at` are also visible to anyone with DB access. Spec list should be expanded — Finding 3.6.
4	Side channels in WebCrypto / WASM	Power analysis / fault injection on browser/runtime out of scope.	Yes. No constant-time guarantees claimed for non-AEAD primitives. Argon2id timing via libsodium is data-independent for the salt/output paths; bcrypt for share password is server-side (out of crypto-layer scope).
5	User key loss	If password and recovery key are both lost, data is unrecoverable.	Yes. Both password and recovery key paths independently unwrap `rk` (`web/utils/keyring.ts:293-305` for recovery). Threat model does not explicitly address recovery-key theft as an attack surface — Finding 3.4.

Threats not in the spec adversary model

#	Threat	Notes
—	Share-link recipient holds partial decrypt authority for one file	The threat model has no notion of a "share-link recipient" as a partial-trust party. Anyone with the share URL plus the share password decrypts exactly one file; no further reach into the owner's vault. Server-side bcrypt + per-share lockout limits brute force (out of crypto-layer scope). Worth enumerating in the threat model — Finding 3.5.
—	Recovery-key theft (separate from password compromise)	Independent path to `rk` not flagged as a distinct adversary. Same access as password-only theft, but the user-facing storage surface (where the recovery key is written down) is different and may be less protected — Finding 3.4.
—	v0 legacy files (no AEAD-bound integrity)	v0 production data has no `file_id` binding, no chunk AAD, no truncation detection (see `crypto/spec/format-v0.md` § "Limitations of v0"). Defense-in-depth via stored `cipher_parts_sha1` is application-layer, not AEAD. v1 writer flag flipped only days ago, so the bulk of live data is v0. Threat model presents v1 invariants as universal — Finding 3.3.

Findings

Finding 3.1 — Threat model A2 describes cross-file splice resistance as "file_id AAD" but the mechanism is HKDF salt = file_id

Severity: Medium Affects: v1 Component: spec (crypto/spec/threat-model.md)

Description. Threat model § "A2 — Active network adversary" lists:

"Splice chunks across files (file_id AAD)."

This is the same misstatement Task 1 Finding 1.1 flagged in crypto/spec/format-v1.md and that landed in crypto#1: file_id is NOT in chunk AAD. The chunk AAD is exactly 36 bytes (crypto/src/internal/types.ts:90, CHUNK_AAD_LENGTH = 19 + 8 + 8 + 1 = 36) and consists of the domain string, chunk_idx, total_chunks, and is_final only — crypto/src/format/header.ts:219-236 (buildChunkAad).

The cross-file splice resistance the threat model claims IS present, but structurally rather than via AAD content. Two layers:

chunk_key = HKDF(content_key, salt=file_id, info="shieldfive/v1/<suite>/chunk-key", L=32) — crypto/src/suites/aes-gcm-v1/index.ts:68 (and analogues in xchacha-v1, pq-hybrid-v1). Different file_id → different chunk_key → AEAD fails on a spliced chunk.
nonce_prefix = HKDF(file_id, salt=zeros(32), info="shieldfive/v1/<suite>/nonce-prefix", ...) — crypto/src/suites/aes-gcm-v1/index.ts:80. Different file_id → different IV — AEAD fails.

The defense holds; the threat model's parenthetical pointer is just wrong about where it lives. Same severity rationale as 1.1: an independent implementer or auditor who reads the threat model and expects to find file_id bytes inside chunk AAD bytes will be confused and may "helpfully" patch it in, breaking wire compatibility.

Reproduction or evidence. Read crypto/spec/threat-model.md § "A2 — Active network adversary" line: Splice chunks across files (file_id AAD). Compare against crypto/src/format/header.ts:219-236 and crypto/src/internal/types.ts:90.

Recommendation. Spec change only; bundle with the next threat-model refresh. Replace:

"Splice chunks across files (file_id AAD)."

with:

"Splice chunks across files (chunk_key and nonce_prefix are derived from file_id via HKDF; see spec/format-v1.md § 'Suite payloads' for each suite's derivations. file_id is NOT in chunk AAD; the binding is structural)."

Cross-link to Task 1 Finding 1.1 (and the test vector added in crypto#1) so the threat model doesn't drift again.

Finding 3.2 — A3 protection is documented via suite `0x03` but the web app does not yet emit `0x03`

Severity: Medium Affects: v1 (PQ-hybrid suite); spec; web integration Component: spec (crypto/spec/threat-model.md) + web integration

Description. Threat model § "A3 — Future quantum adversary":

"A3 cannot recover plaintext from files encrypted with suite 0x03 (pq-hybrid-xchacha-mlkem1024-v1)..."

And the comparison table positions the system as:

"ShieldFive (v1, suite 0x03) | XChaCha20-Poly1305 + ML-KEM-1024 hybrid"

The A3 paragraph is technically correct — it scopes the claim to suite-0x03 files. The implementation in the library supports the suite end-to-end:

ML-KEM keypair derivation: crypto/src/suites/pq-hybrid-v1/index.ts:117-126 (deriveMlKemKeypair).
Encapsulation, combined-key HKDF, and chunk key derivation: crypto/src/suites/pq-hybrid-v1/index.ts:138-178.
Whole-blob entry points: crypto/src/suites/pq-hybrid-v1/api.ts:168-186 (per Task 1 Finding 1.4c, decapsulation-before-MAC is structurally required and now spec-blessed in crypto#3).

The web app does not produce suite-0x03 files today. The worker imports only the AES-GCM-v1 stream:

// web/utils/sfCryptoWorkerImpl.ts:34-37
import {
  createAesGcmV1DecryptStream,
  createAesGcmV1EncryptStream,
} from '@shieldfive/crypto/streams/aes-gcm-v1'

The 72-byte suite_payload override at web/utils/sfCryptoWorkerImpl.ts:54 (V1_AES_GCM_SUITE_PAYLOAD_LENGTH = 72) hard-codes the AES-GCM-v1 suite. The encrypt modal selects between v0 and v1 via the CRYPTO_V1_ENABLED flag (web/utils/cryptoV1Flag.ts:15-16) but neither branch produces suite 0x03. The upload API rejects neither — cipher_version is just 1 (v0) or 2 (v1) in web/app/api/files/create-upload-session/route.ts:106-115, with no PQ-hybrid pathway yet.

Net effect: every file the production web app encrypts today is classical-only and subject to "harvest now, decrypt later" against A3.

Why this is Medium, not Low:

The natural reading of the threat model misleads. The comparison table column is "ShieldFive (v1, suite 0x03)" with no "current deployment" qualifier; a privacy-conscious user or an external auditor would conclude the deployed product encrypts to 0x03 today. Task 2 explicitly noted "PQ-hybrid; not yet wired into the web app" but the threat model doesn't.
The A3 sentence "Files encrypted with classical-only suites SHOULD be re-encrypted with the hybrid suite when migration is feasible" implies an asymmetric posture (today classical, tomorrow PQ). The actual state is symmetric: today and tomorrow classical, until Phase 3 lands and the web worker is updated.

Why this is Medium, not High:

The threat model statement itself, narrowly read, is not false. Suite-0x03 files (when they exist) ARE A3-protected.
The mitigation is a documentation update + Phase 3 delivery; classical confidentiality against A1/A2 is unaffected.

Reproduction or evidence. Two grep-anchored facts:

$ grep -rn "createPqHybridV1\|deriveMlKemKeypair" web/
# (no hits in production code)

$ grep -n "^import" web/utils/sfCryptoWorkerImpl.ts | head
# imports streams/aes-gcm-v1 only

And read crypto/spec/threat-model.md § "A3" + § "Comparison with comparable products" side by side with the worker imports.

Recommendation. Two changes to the threat model, bundled with the next refresh:

Add a "Current deployment status" subsection under § "A3":

"As of 2026-05-11, the production web client emits suite 0x01 (aes-256-gcm-v1) for new uploads. Suite 0x03 (pq-hybrid-xchacha-mlkem1024-v1) is implemented in the @shieldfive/crypto library but is not yet wired into the web worker; until Phase 3 ships, all newly-uploaded files are classical-only and are NOT protected against A3. This document will be updated when 0x03 becomes the default writer suite."
Add a footnote to the comparison table's ShieldFive row:

"† Suite 0x03 is available in the crypto library (1.0.0-alpha.3) and is the format-v1 spec default; the production web client has not yet been updated to emit it. See § 'A3 — Current deployment status'."

Track as a Phase-3 documentation-prerequisite alongside the existing Phase-3 wire-up work.

Finding 3.3 — Threat model omits v0 acknowledgement; v0 production data lacks v1's AEAD-bound integrity invariants

Severity: Medium Affects: v0 (legacy production format) — most live data as of 2026-05-11 Component: spec (crypto/spec/threat-model.md)

Description. crypto/spec/format-v0.md § "Limitations of v0 (addressed in v1)" enumerates exactly the properties the threat model § A2 claims as guarantees:

"No truncation detection. Trailing chunks can be removed and the reader will produce a valid (truncated) plaintext. v0 deployments detect this at the application layer using cipher_parts_sha1, not in the AEAD."

"No file_id binding. Chunks can be spliced between two files encrypted with the same content key..."

"No AAD on chunks. Chunk index, total chunks, and is-final flag are not authenticated. Reordering is detected only because the IV depends on the index, but reordering does not produce a clear error message."

Production-data demographics as of this review:

The v1 writer is gated by NEXT_PUBLIC_CRYPTO_V1_ENABLED — web/utils/cryptoV1Flag.ts:15-16:

"Default OFF: anything other than the literal string 'true' (unset, empty, false, '1', etc.) keeps the v0 writer engaged."
The cipher_version column was extended to accept 2 (= v1 wire format) via migration 20260509213345_extend_cipher_version_for_v1.sql on 2026-05-09 (two days before this review). Migration 20260510211958_finalize_upload_accepts_v1_proof.sql landed 2026-05-10. v1 production traffic has had at most ~48 hours to accumulate.

The threat model presents:

"A2 cannot: ... Truncate a file undetectably (final-chunk AAD is_final flag). Reorder chunks undetectably (chunk-index AAD). Splice chunks across files (file_id AAD)."

as the integrity story without qualification. For v0 data, all three defenses are absent — defense relies on application-stored cipher_parts_sha1 (Task 6 scope) and on the structural fact that v0 files use distinct random IV prefixes per file (which limits same-key cross-file splice to the same-prefix degenerate case that should never occur under correct use).

Why this is Medium, not Low:

Most live data is v0. A user reading the threat model and concluding "my ShieldFive files are AEAD-protected against truncation" would be wrong about ~all of their existing files and right only about uploads from the last ~48 hours.
The v0 → v1 migration story has no timeline in any of the spec docs. Migration is the only way live data picks up these invariants; the threat model should at least name it.
Defense-in-depth via cipher_parts_sha1 is application-layer, not AEAD-layer. A different review (Task 6) covers whether cipher_parts_sha1 is actually checked on every download. Regardless of that result, the spec should not claim AEAD-bound invariants for files that don't have them.

Reproduction or evidence. Three documents read in sequence:

crypto/spec/format-v0.md § "Limitations of v0".
crypto/spec/threat-model.md § "A2".
web/utils/cryptoV1Flag.ts + migration filenames in web/supabase/migrations/ (search for cipher_version to see the v1 introduction date).

Recommendation. Add a "Legacy v0 data" subsection to the threat model immediately after the A2 enumeration:

"Legacy v0 data. Files encrypted with the v0 wire format (predating this specification — see spec/format-v0.md) do NOT carry the AEAD-bound integrity properties above. Specifically, v0 files have no chunk AAD, no file_id binding, and no truncation detection. Defense-in-depth for v0 files relies on the application's stored per-chunk SHA-1 hashes (cipher_parts_sha1), verified on download. Migration to v1 is required to gain the AEAD-bound integrity properties documented above. As of 2026-05-11, the v1 writer is enabled but most live data is still v0; migration policy and timeline are tracked separately."

This change is purely documentation; the v0 reader's cipher_parts_sha1 check itself is Task 6 scope.

Finding 3.4 — Recovery-key compromise is not in the adversary model

Severity: Low Affects: integration Component: spec (crypto/spec/threat-model.md)

Description. The threat model's adversary list (A1, A2, A3) does not address an attacker who has obtained the recovery key but not the password. The system supports two paths to recover rk:

Password → PBKDF2(password) (or Argon2id, per Task 2 Finding 2.1) → unwrap rkWrappedByUk → rk — web/utils/vaultKeyClient.ts:100-147.
Recovery key (32 random bytes shown to the user once) → unwrapRootKeyWithRecovery(recoveryKey, rkWrappedByRec, recIv) → rk — web/utils/keyring.ts:293-305.

A recovery-key-only attacker gets the same access as a password-only attacker (everything). The threat model says "user key loss" is out of scope but doesn't address "recovery key theft", which is functionally equivalent to password compromise via a different storage surface.

This matters because the operational storage surface for the recovery key is typically different from the password. Users often back the recovery key up to an email, a notes app, a screenshot, or paper — each of which has a different threat profile than the password (which is typically only in the user's head and possibly a password manager).

Reproduction or evidence. Read the adversary list in crypto/spec/threat-model.md § "Adversary model" — no mention of recovery-key compromise. Then trace the recovery path in web/utils/keyring.ts:293-305 to confirm it is functionally equivalent to the password path.

Recommendation. Add a paragraph to § "Out of scope" or § "Adversary model":

"Recovery-key compromise. The recovery key is a 32-byte random value generated client-side at signup and shown to the user exactly once (regeneration is supported via in-app UI). Possession of the recovery key is functionally equivalent to knowing the password — it unwraps the same rk via a different wrapping key. The threat model treats recovery-key theft as out of scope (the host application's recovery-key UX, covered by the separate server-side review, is responsible for guiding users to a secure backup channel); confidentiality against A1 still holds for users who back the recovery key up to a non-leaked location. If a user backs the recovery key up to a leaked location (email plaintext, screenshot in cloud-synced photos, paper in an insecure environment), an attacker reaching that backup obtains the same access as the user."

Finding 3.5 — Share-link recipients are a partial-access party the threat model doesn't enumerate

Severity: Low Affects: integration Component: spec (crypto/spec/threat-model.md)

Description. The threat model defines three adversaries (A1, A2, A3) and one trust principal (the user). It does not enumerate share-link recipients as a partial-access party. A recipient with the share URL plus the share password can:

Trigger an authenticated download of one file via web/app/api/share/[slug]/route.ts:54-422.
Consume one decrement of share_max_downloads (atomic via reserve_share_download RPC).
Decrypt that single file's plaintext: the password unwraps csk_pw_wrapped via PBKDF2 (or Argon2id) on the client, the server-side verify-password flow at web/app/api/share/[slug]/verify-password/route.ts gates delivery of those key materials.
Never reach rk, sibling/parent folder keys, other files' CSKs, or the folder tree — those materials are not delivered to share endpoints.

If the share password is the auto-share password (derived from SHA-256("shieldfive:auto-share:v1:" || rk_b64) truncated to 24B — web/utils/sharePassword.ts:18-32), the share password is high-entropy (~144 bits after the truncation). If the password is user-chosen, brute force against bcrypt-wrapped share_password_hash is rate-limited per-share + per-IP + per-burst (out of crypto-layer scope).

The implementation handles this correctly. The threat model just doesn't enumerate the recipient as a distinct party with bounded access. An auditor asking "what's the worst a malicious share-link recipient can do?" deserves an explicit answer.

Reproduction or evidence. Read the share download path at web/app/api/share/[slug]/route.ts side by side with the threat model's adversary list (which contains no mention of "share recipient").

Recommendation. Add a § "Share-link recipients" subsection under § "Adversary model" or § "Trust principals":

"Share-link recipient. A third party in possession of a share URL and the share password is given decrypt authority for exactly one file. Their reach is bounded by:

One file. Each share is scoped to a single file_id; share URLs do not enumerate folder contents or sibling files.

Bounded downloads. share_max_downloads is enforced atomically.

Optional geo-lock and expiry. Per-share allowed countries and expiry timestamp.

No key reach. The share endpoint never returns rk, parent folder keys, or other CSKs.

Brute force. Bcrypt-wrapped share_password_hash is rate limited per-share, per-IP, and per-burst; lockout state is tracked in share_password_attempts (out of crypto-layer scope).

The recipient is a trusted third party for the single file shared; the system makes no confidentiality guarantee against that recipient. Confidentiality against A1 is unaffected (the server doesn't have the share password in cleartext; it has only the bcrypt hash)."

Finding 3.6 — "Metadata leakage at the storage layer" understates what the deployed schema exposes

Severity: Low Affects: integration Component: spec (crypto/spec/threat-model.md)

Description. Threat model § "Out of scope / Metadata leakage at the storage layer":

"File sizes, upload patterns, access timestamps, and folder cardinality are visible to anyone with database access. This crypto layer encrypts file content and names; it does not pad sizes or randomize upload timing."

The deployed schema exposes additional metadata not covered by this four-item list:

Metadata	Where stored	What A1 learns
`name_hash` (HMAC-SHA-256 keyed by SHA-256(rk))	`files.name_hash`, `folders.name_hash` (every upload/rename/move)	Two files/folders share a name within a user's vault; no cross-user leakage (different rk → different HMAC key)
`share_events.viewer_country`	Per-share geo (derived from edge header or geo fallback)	Where shared-file recipients accessed from, per share
`share_events.viewer_ip` (HMAC-SHA-256 under server secret)	Per-share IP fingerprint; the secret (`SHARE_IP_HASH_SECRET`) is a server env var that A1 sees	Joinable across share events from the same IP within the secret's rotation period
`share_password_attempts` lockout state	Per `share_id`	How many failed attempts a share has, when last attempt occurred, whether it's locked
`files.download_count`	Per file	Number of times the file was downloaded
`subscriptions.last_active_at`	Per user	Last share-access activity timestamp

None of these break confidentiality. A1 cannot brute-force name_hash without rk; the IP HMAC is keyed by a server secret that A1 trivially holds (so it doesn't add protection against A1 — only against weaker adversaries) but still doesn't reveal plaintext IPs across rotations.

The threat model's four-item list reads as exhaustive when read by an external auditor. A user reading "file sizes, upload patterns, access timestamps, and folder cardinality" wouldn't realize that share viewer countries and per-share download attempt counts are also stored.

Reproduction or evidence. Grep the migrations:

$ grep -n "name_hash\|viewer_country\|viewer_ip\|share_password_attempts\|download_count\|last_active_at" \
    web/supabase/migrations/20250119000000_initial_schema.sql

returns the schema definitions for each item. Cross-reference with web/app/api/share/[slug]/route.ts:300-322 to see share_events writes including geo and IP-hash on every share access.

Recommendation. Expand the spec paragraph to:

"Metadata leakage at the storage layer. A non-exhaustive list of metadata visible to anyone with database access: file sizes, upload timestamps, access timestamps, folder cardinality, deterministic per-user filename hashes (HMAC-SHA-256 keyed by a per-user secret), per-share viewer geography and per-share hashed IP (HMAC-SHA-256 under a server secret), per-share lockout state and download counts, per-file owner download counts, and aggregate storage usage timestamps. The crypto layer encrypts file content and the ciphertext form of filenames; it does not encrypt access logs, search-hash material, or share metrics. Applications requiring metadata protection beyond filename-content secrecy must build it on top."

The list grows over time, so the "non-exhaustive" framing is more durable than enumeration.

Finding 3.7 — TLS termination boundary is implicit in the A2 "Cannot break TLS" assumption

Severity: Informational Affects: integration Component: spec (crypto/spec/threat-model.md)

Description. A2's "Cannot break TLS" is the threat model's network-confidentiality assumption. In practice, TLS terminates at the Vercel/Cloudflare edge, then a separate TLS session continues to Supabase / Backblaze. The AEAD layer is end-to-end client-to-blob, so the threat model's confidentiality story holds even against an attacker who controls the TLS termination point — they see ciphertext plus headers, exactly like A1. But the threat model doesn't say where TLS terminates, so the reader is left to infer it.

Not a security gap — the defense holds against A2 by construction — but the threat model should describe the trust boundary so the "Cannot break TLS" constraint is interpretable.

Reproduction or evidence. No code reference; this is a documentation observation.

Recommendation. Append to A2:

"TLS is terminated at the edge platform (Vercel/Cloudflare) and a fresh TLS session is used to talk to Supabase and B2. The AEAD layer is end-to-end client-to-blob, so an attacker controlling the TLS termination is effectively A1 (sees ciphertext, not plaintext). A2's 'cannot break TLS' is therefore a statement about the client↔edge leg specifically."

What I checked but did not find issues with

The following areas were reviewed against the threat model and the integration as shipped; no drift was found, or the drift is filed above.

A1 confidentiality mechanisms

No plaintext leaves the client. The v1 worker (web/utils/sfCryptoWorkerImpl.ts:213-600) encrypts in-process; nothing in installSfCryptoWorker writes plaintext to a network or storage destination. Encrypted bytes are emitted to the main thread as chunk messages and then uploaded via signed B2 URLs (Section "B2 direct upload" below).
B2 direct upload preserves the boundary. Large uploads use B2 pre-signed upload-part URLs handed to the client; the Next.js server never receives the ciphertext — web/app/api/files/upload-part-url/route.ts:73-87. The proxy path web/app/api/files/upload-part/route.ts forwards ciphertext only, with size + SHA-1 cross-checks; v1 uploads are explicitly rejected at the proxy (lines 161-169) and go direct-to-B2.
Wrapped keys are AES-GCM-wrapped under material the server never sees. csk_wrapped, fk_wrapped, rkWrappedByUk, rkWrappedByRec all use keys that are downstream of password (PBKDF2 → AES-GCM) or recovery key (random → AES-GCM), never server-side. The download endpoint at web/app/api/files/download/route.ts:93-109 returns the wrapped envelope; client unwraps locally.
Share password verification gates key delivery. The share GET endpoint at web/app/api/share/[slug]/route.ts:405-421 explicitly returns cskPwWrapped: null and friends; the verified password endpoint at web/app/api/share/[slug]/verify-password/route.ts is what delivers wrapped key materials after server-side bcrypt verification (out of crypto-layer scope).

A2 integrity mechanisms (v1 only — Finding 3.3 covers v0)

AEAD authentication on every chunk before plaintext emission (crypto/src/streams/aes-gcm-v1.ts:340-348).
Header MAC verified before any chunk processing (crypto/src/format/header.ts:201-214; Task 1 Findings 1.4a/b/c track ordering nuances, all fixed in crypto#2/crypto#3).
Truncation detection by both is_final AAD and plaintext_size header field + crypto/src/streams/aes-gcm-v1.ts:399-401 trailing-bytes check. Either alone suffices; both together is defense in depth.
Reorder detection by chunk_idx in AAD and IV bound to chunk_idx via nonce_prefix || uint64_be(i). Either alone suffices.
Cross-file splice resistance by HKDF salt = file_id on chunk_key and HKDF IKM = file_id on nonce_prefix (Finding 3.1 covers the prose mismatch).

A3 mechanisms in the library

ML-KEM-1024 keypair derivation is deterministic from rk. See crypto/src/suites/pq-hybrid-v1/index.ts:117-126 (deriveMlKemKeypair). The seed split convention is now spec-blessed in Task 2 Finding 2.4 → crypto#3.
Combined key K is HKDF over both shares with file_id salt. See crypto/src/suites/pq-hybrid-v1/index.ts:138-152 and Task 2's HKDF table (rows 5-8).
Decapsulation-before-MAC is structurally required and now spec-blessed. Task 1 Finding 1.4c → crypto#3 added the security argument.

These mechanisms exist and are correct in the library. Finding 3.2 documents that they are not yet exercised by the web app.

Cross-user attacks (mentioned in the task instructions, not in the threat model)

RLS policies enforce auth.uid() = user_id on every user-facing table. Verified for files, folders, subscriptions, bandwidth_usage, vault_keys in web/supabase/migrations/20250119000000_initial_schema.sql:1691-1718.
Service-role bypass is audited. Every withServiceRole(scope, ...) writes a row to service_role_audit (see web/utils/supabase/serviceRole.ts; the separate server-side review covers this in depth). No service-role-bypass code path I read reads or writes plaintext.
Per-user envelope keys. Folder/file keys are wrapped under the uploading user's keyring; the server cannot move a csk_wrapped from user A to user B and have B decrypt it because: (a) files.user_id is auth.uid()-locked via RLS, (b) even if A1 substitutes csk_wrapped bytes between users, B's unwrap fails (different wrapping key), and (c) even if A1 somehow planted a CSK and B unwrapped it, the file_id HKDF binding in chunk derivation would fail decryption. Three layers of mutual independence.

Server-side compromise scenarios

DB + B2 attacker (no password, no recovery key). Wrapped envelopes only; chain of unwraps requires rk; rk is not in the DB. No decryption possible. Confirmed via Task 2 § "Master-secret lifecycle".
DB + B2 + RAM scrape attacker. Out of scope for this task — the separate server-side review covers in-memory rk lifecycle and BFCache. The threat model's A1 explicitly excludes "compromise the user's device" which would include browser RAM.

MitM (TLS + integrity) — A2 specific

TLS termination is at the edge (Finding 3.7).
AEAD authentication catches any byte modification (per A2's defenses above).
Per Task 1 Finding 1.6 (fixed in web#123), cipher_version is thread-explicit so an A2 attacker cannot trigger version confusion.
v0/v1 isolation is structural (Task 1 Finding 1.11) — an A2 cannot re-route a v1 blob to the v0 decryptor and vice versa without AEAD failure.

Out-of-scope items remain out of scope under integration

Malicious client delivery. CSP enforced in production (web/middleware.ts); validated at boot (web/utils/validateEnv.ts:54-57). CSP is a host-app mitigation (out of crypto-layer scope); the crypto layer claims no defense.
Endpoint compromise. No integration change moved plaintext off the device. Master-key in-memory lifecycle is out of crypto-layer scope.
Side channels in WebCrypto / WASM. Argon2id timing via libsodium is data-independent for salt/output paths; bcrypt for share password is server-side and not user-controlled.
User key loss. Recovery-key + password independence still holds (Task 2 § "Master-secret lifecycle"); regenerating the recovery key re-wraps rk without rotating it (so files remain readable). Finding 3.4 expands this to recovery-key-compromise (vs. user-key-loss).

Out of scope (deferred to other tasks)

AEAD usage details (nonce reuse audit, AAD non-emptiness, tag-check-before-plaintext, wrong-key fail-fast) — Task 4.
Streaming invariants (single-frame-in-flight, content key zeroing, deferred-header-bytes deadlock prevention) — Task 5.
Upload-proof three-way cross-check for v0/v1 files — Task 6.
Server-side and web-application surfaces (share-password handling, share-link surface, service-role boundary, RLS, database constraints, RPC authorization, CSP and XSS surface, master-key in-memory handling, session management, recovery-key UX, logging and PII) are out of crypto-layer scope and covered by the separate server-side review, kept private.

References

crypto/spec/threat-model.md
crypto/spec/format-v0.md
crypto/spec/format-v1.md
crypto/spec/key-derivation.md
Internal review prior tasks (inputs):
@shieldfive/crypto 1.0.0-alpha.3, repo HEAD a5e99ac:
web app, repo HEAD bfa41a57:
- web/utils/sfCryptoWorkerImpl.ts
- web/utils/sfCryptoDecrypt.ts
- web/utils/cryptoV1Flag.ts
- web/utils/keyring.ts
- web/utils/keyCrypto.ts
- web/utils/vaultKeyClient.ts
- web/utils/metadataClient.ts
- web/utils/sharePassword.ts
- web/middleware.ts (CSP enforcement)
- web/app/api/files/create-upload-session/route.ts
- web/app/api/files/upload-part/route.ts
- web/app/api/files/upload-part-url/route.ts
- web/app/api/files/download/route.ts
- web/app/api/share/[slug]/route.ts
- web/app/api/share/[slug]/verify-password/route.ts
- web/utils/storage/backblaze.ts
- web/utils/storage/backblazeLarge.ts
Supabase migrations:
- web/supabase/migrations/20250119000000_initial_schema.sql (RLS policies, share tables, name_hash)
- web/supabase/migrations/20260509213345_extend_cipher_version_for_v1.sql (v1 introduction date — Finding 3.3)
- web/supabase/migrations/20260510211958_finalize_upload_accepts_v1_proof.sql

Task 3 — Threat model review

Task 3 — Threat model review against current implementation

Summary

Threat coverage table

Protected threats

Excluded threats (out of scope)

Threats not in the spec adversary model

Findings

Finding 3.1 — Threat model A2 describes cross-file splice resistance as "file_id AAD" but the mechanism is HKDF salt = file_id

Finding 3.2 — A3 protection is documented via suite 0x03 but the web app does not yet emit 0x03

Finding 3.3 — Threat model omits v0 acknowledgement; v0 production data lacks v1's AEAD-bound integrity invariants

Finding 3.4 — Recovery-key compromise is not in the adversary model

Finding 3.5 — Share-link recipients are a partial-access party the threat model doesn't enumerate

Finding 3.6 — "Metadata leakage at the storage layer" understates what the deployed schema exposes

Finding 3.7 — TLS termination boundary is implicit in the A2 "Cannot break TLS" assumption

What I checked but did not find issues with

A1 confidentiality mechanisms

A2 integrity mechanisms (v1 only — Finding 3.3 covers v0)

A3 mechanisms in the library

Cross-user attacks (mentioned in the task instructions, not in the threat model)

Server-side compromise scenarios

MitM (TLS + integrity) — A2 specific

Out-of-scope items remain out of scope under integration

Out of scope (deferred to other tasks)

References

Finding 3.2 — A3 protection is documented via suite `0x03` but the web app does not yet emit `0x03`