Loocero — Privacy Architecture
Status as of M12P.1. This document codifies the privacy contract Loocero ships under. It is the authoritative reference for what data the hosted SaaS holds, what it never holds, and how the boundary is enforced.
Companion docs: DATA_RETENTION.md, DATA_MAP.md, production-hardening.md §8.
1. The seven non-negotiable rules
These rules are the contract. No feature, marketing claim, or operational change may violate them.
- Raw uploaded import files are deleted automatically after successful import finalization.
- Abandoned uploads and parser staging artifacts are auto-deleted on a short retention window.
- Hosted SaaS stores only normalized financial records and minimal import metadata.
- Hosted SaaS stores no saved notes.
- Hosted SaaS stores no AI chat history, summaries, or memory.
- AI chat is real-time only. Users may export chats locally to their own device, but nothing is persisted in Loocero.
- Internal logging, analytics, and error reporting do not capture financial payloads or chat content.
Note (2026-05-02 licensing pivot): an earlier revision of this doc included an eighth rule ("Self-hosting remains the privacy-max path for users who want full infrastructure control."), dating from a planned open-source release. Loocero now ships as proprietary, hosted SaaS (see NOTICE at the repo root) and does not offer a default self-hostable build. The eighth rule was dropped to keep the contract truthful: every rule above is enforced today by the hosted product code. Private self-hosted deployments may be offered in the future under a commercial agreement, but that is a sales arrangement, not a default product guarantee, and does not belong in this contract.
2. Enforcement gate (M12P)
M12P is the gate milestone for privacy claims. No public marketing language may state, suggest, or imply any of the seven rules until M12P.4 is complete and merged. That includes the Loocero landing page, README hero copy, social posts, and pricing-page bullets.
Sequence:
| Sub | Scope | Status gate for marketing? |
|---|---|---|
| M12P.1 | Codify the contract (this doc + DATA_RETENTION + DATA_MAP + production-hardening §8 + memory rebrand) | Docs only — claims still embargoed |
| M12P.2 | Remove /api/chat server-side persistence + add local Markdown/JSON export | Rules 5/6 partially enforced (writes stop, tables remain) |
| M12P.3 | Drop conversations + messages tables via migration | Rules 5/6 fully enforced |
| M12P.4 | Sentry beforeSend scrubber + tightened lib/observability/log.ts allow-list | Rule 7 enforced; marketing embargo lifts here |
| M12P.5 | Drop transactions.raw_row jsonb column | Rule 3 fully enforced |
Until M12P.4 ships to production, the public story is "Loocero is being built privacy-first": a statement of direction, not a statement of fact such as "stores no chat history." The truthful claim is gated on enforcement, not documentation.
3. Trust boundaries
Loocero is a Next.js + Supabase app. Three concentric trust zones:
┌─────────────────────────────────────────────────────────┐
│ Browser (untrusted) │
│ - User input, Plaid Link iframe, AI chat input │
│ - Holds session cookie (httpOnly, Secure, SameSite) │
│ - In-memory chat state (cleared on tab close) │
└──────────────────┬──────────────────────────────────────┘
│ TLS 1.3
┌──────────────────▼──────────────────────────────────────┐
│ Vercel runtime (semi-trusted, ephemeral) │
│ - Server Actions + Route Handlers │
│ - In-memory CSV/PDF parsing — never persists files │
│ - OPENAI_API_KEY, PLAID_*, SUPABASE_* in env only │
│ - No filesystem writes (read-only Lambda) │
└──────────────────┬──────────────────────────────────────┘
│ TLS 1.3 + Supabase JWT (RLS-scoped)
┌──────────────────▼──────────────────────────────────────┐
│ Supabase Postgres (tenant-isolated, at-rest encrypted) │
│ - Normalized financial rows, RLS by user_id │
│ - Plaid access tokens via pgcrypto (M12 onward) │
│ - No raw upload files, no chat history (post-M12P.3) │
└─────────────────────────────────────────────────────────┘
The service-role key is never used by the app runtime. Only the integration test harness instantiates an admin client, and only for the auth.admin.createUser/deleteUser lifecycle. See docs/production-hardening.md §2.
4. Data flows
4.1 CSV / PDF import (rules 1, 2, 3)
File picked in browser
│ multipart/form-data POST to Server Action (parseCsvFile / pdf route)
▼
Server Action receives `File` blob
│ reads via file.text() / unpdf into memory
│ parses → headers + rows array
│ enriches with merchant matches (read-only DB lookups)
▼
Returns ParseResult to browser ◄── file.text() result is GC'd as soon as
the action returns. No tmp/, no Storage,
no disk write at any stage.
│ user reviews preview, clicks Import
▼
Server Action importTransactions(payload)
│ classifies rows against existing transactions (RLS-scoped)
│ inserts imports row (status = PROCESSING)
│ bulk-inserts transactions
│ updates imports row (status = DONE, outcome = counts only)
▼
Done. Browser shows summary.
Why this satisfies rule 1: the source File object is a request-scoped Web API value. After the action returns its ParseResult, no reference to the original bytes survives. Vercel's Lambda filesystem is read-only outside of /tmp, and Loocero never writes to /tmp. Confirmed by inspection of src/lib/actions/import.ts and src/app/api/import/pdf/route.ts.
Why this satisfies rule 2: there is no staging table, no Supabase Storage bucket, no scheduled job that purges anything — because nothing is ever written. The "abandoned upload" case collapses to "the user closed their tab," which leaves zero artifacts behind.
Why this satisfies rule 3: post-M12P.5, the transactions table holds only typed columns (date, description, amount, type, currency, category_id, ...). The raw_row jsonb column was dropped in migration 20260502010000_drop_transactions_raw_row.sql. The imports row holds only filename (user-supplied label), status, row_count, and outcome (typed integer counts).
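The normalization described above can be sketched as a pure in-memory function. This is a minimal illustration, not the code in src/lib/actions/import.ts: the names `NormalizedTransaction`, `ImportOutcome`, `normalizeRows`, and the `accepted`/`skipped` counters are hypothetical, the column set is a subset of the typed columns listed above, and the CSV handling is deliberately naive (no quoted fields).

```typescript
// Hypothetical types: field names follow the typed columns listed above,
// but the exact schema is an assumption.
type NormalizedTransaction = {
  date: string; // ISO date, e.g. "2026-04-30"
  description: string;
  amount: number; // signed amount
  currency: string;
};

// Counts only: no row content is carried into the imports record.
type ImportOutcome = { accepted: number; skipped: number };

function normalizeRows(
  csvText: string,
  currency: string,
): { rows: NormalizedTransaction[]; outcome: ImportOutcome } {
  // Everything below operates on strings held in memory; nothing is written out.
  const lines = csvText.split(/\r?\n/).filter((l) => l.trim() !== "");
  const [headerLine, ...dataLines] = lines;
  const headers = (headerLine ?? "")
    .split(",")
    .map((h) => h.trim().toLowerCase());
  const iDate = headers.indexOf("date");
  const iDesc = headers.indexOf("description");
  const iAmt = headers.indexOf("amount");

  const rows: NormalizedTransaction[] = [];
  let skipped = 0;
  for (const line of dataLines) {
    const cells = line.split(",").map((c) => c.trim());
    const date = cells[iDate] ?? "";
    const rawAmount = cells[iAmt] ?? "";
    const amount = rawAmount === "" ? NaN : Number(rawAmount);
    // Rows that do not fit the typed shape are counted, not stored.
    if (!/^\d{4}-\d{2}-\d{2}$/.test(date) || !Number.isFinite(amount)) {
      skipped++;
      continue;
    }
    // Only typed fields are kept; the original line is not attached to the row.
    rows.push({ date, description: cells[iDesc] ?? "", amount, currency });
  }
  return { rows, outcome: { accepted: rows.length, skipped } };
}
```

The point of the shape is that the return value has no slot for the raw line: once the function returns, the only things a caller can persist are typed fields and integer counts.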
4.2 AI chat (rules 5, 6, 7)
Post-M12P.3:
User types a message in /chat
│ in-memory message[] state in ChatShell
▼
POST /api/chat { messages: [...last 10] } ◄── nothing in the browser
│ auth check (Supabase JWT) state is persisted to
│ buildFinancialContext(userId) localStorage or IndexedDB
│ → reads from transactions, accounts,
│ budgets, etc. via RLS
│ system = context + GUARDRAILS
▼
streamText(model, system, messages)
│ tokens stream back to browser
▼
Stream closes. onFinish runs no DB writes. ◄── post-M12P.2
No conversations row.
No messages row.
No "memory."
│
▼
Browser holds the conversation in React state.
User can hit Reset (clears state) or Export (downloads .md or .json).
Closing the tab discards the conversation forever.
Why this satisfies rules 5/6: after M12P.3, the conversations and messages tables do not exist. There is no place for chat to persist server-side. The onFinish hook in /api/chat does not write. Browser state is useState only — no localStorage, no IndexedDB.
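The client-side half of this flow (trimming to the last 10 messages and exporting locally) can be sketched with pure functions. This is an illustration under assumptions: the `ChatMessage` shape and the helper names `lastN`, `toMarkdown`, and `toJson` are hypothetical, and the actual export format in ChatShell may differ.

```typescript
// Hypothetical message shape; the real ChatShell state may differ.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Only the most recent turns are sent to /api/chat (10 in the flow above).
function lastN(messages: ChatMessage[], n = 10): ChatMessage[] {
  // slice(-0) would return the whole array, so guard the zero case.
  return n > 0 ? messages.slice(-n) : [];
}

// Markdown export: one labelled block per message, separated by rules.
function toMarkdown(messages: ChatMessage[]): string {
  return messages
    .map((m) => `**${m.role === "user" ? "You" : "Assistant"}:**\n\n${m.content}`)
    .join("\n\n---\n\n");
}

// JSON export: the same in-memory array, pretty-printed.
function toJson(messages: ChatMessage[]): string {
  return JSON.stringify({ messages }, null, 2);
}
```

In the browser, the resulting string can be wrapped in a `Blob` and offered as a download through an object URL, which involves no network request. Because both exporters read only from the in-memory array, there is nothing to export once the tab is closed.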
Why this satisfies rule 7 (post-M12P.4): Sentry beforeSend scrubs financial-payload-shaped strings from event payloads before they leave the runtime. lib/observability/log.ts uses an allow-list (caller passes typed context; everything else is dropped). Stack-trace strings from Postgres errors that may contain row data are scrubbed in beforeSend by regex match on description / amount / date patterns.
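A rough sketch of the two mechanisms follows. It assumes a simplified event shape (`MinimalEvent`) rather than Sentry's full event type, and the regex patterns, the `scrubText` and `pickAllowed` helpers, and the allow-listed key names are all hypothetical; the real rules in the scrubber and in lib/observability/log.ts may differ.

```typescript
// Hypothetical minimal event shape. Sentry's real event type has many more
// fields; only the ones touched here are modelled.
type MinimalEvent = {
  message?: string;
  exception?: { values?: { value?: string }[] };
  extra?: Record<string, unknown>;
};

// Assumed patterns: Postgres "Failing row contains (...)" details,
// ISO dates, and decimal amounts. Real rules would need tuning.
function scrubText(text: string): string {
  return text
    .replace(/(Failing row contains )\(.*\)/g, "$1([redacted])")
    .replace(/\b\d{4}-\d{2}-\d{2}\b/g, "[date]")
    .replace(/-?\$?\d[\d,]*\.\d{2}\b/g, "[amount]");
}

// Shaped like a beforeSend hook: takes an event, returns a scrubbed copy.
function beforeSend(event: MinimalEvent): MinimalEvent {
  const values = event.exception?.values?.map((v) =>
    v.value === undefined ? v : { ...v, value: scrubText(v.value) },
  );
  return {
    message: event.message === undefined ? undefined : scrubText(event.message),
    exception: event.exception ? { ...event.exception, values } : undefined,
    extra: undefined, // free-form context is dropped wholesale
  };
}

// Allow-list logging: only named keys survive; everything else is dropped.
// The key names here are assumptions for illustration.
const ALLOWED_KEYS = ["route", "importId", "rowCount", "durationMs"] as const;
type AllowedKey = (typeof ALLOWED_KEYS)[number];

function pickAllowed(
  context: Record<string, unknown>,
): Partial<Record<AllowedKey, unknown>> {
  const out: Partial<Record<AllowedKey, unknown>> = {};
  for (const key of ALLOWED_KEYS) {
    if (key in context) out[key] = context[key];
  }
  return out;
}
```

The two halves fail in opposite directions: the regex scrubber is a deny-list and can miss a pattern it was not written for, while the allow-list drops anything it does not recognize. That asymmetry is why the logger, where the caller controls the context, uses the allow-list approach.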
4.3 Plaid bank sync (rule 3, plus encryption-at-rest commitment)
User clicks "Connect bank" on /accounts
│ POST /api/plaid/link-token → returns Plaid link_token
▼
react-plaid-link iframe (Plaid-hosted, isolated origin)
│ user enters bank credentials INSIDE Plaid's UI — Loocero
│ never sees the username or password
▼
Plaid issues public_token to the browser
│ POST /api/plaid/exchange { public_token }
▼
Server: itemPublicTokenExchange(public_token) → access_token
│ pgp_sym_encrypt(access_token, env.PLAID_TOKEN_ENC_KEY)
│ insert into institutions (plaid_access_token_enc bytea)
▼
Done. access_token never returns to browser, never logs.
Subsequent sync: /api/plaid/sync decrypts the access token in-memory, calls transactionsSync, maps results through the existing M11 import pipeline (one imports row per sync run, source = 'plaid'), writes typed transactions rows. The decrypted token is GC'd at the end of the request.
The PLAID_TOKEN_ENC_KEY is a Vercel-side env secret. Rotation procedure documented in docs/production-hardening.md §9 (added in M12.B.1).
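The write and read paths for the encrypted token can be sketched as parameterized statements. This is an assumption-laden illustration: the helper names `buildStoreTokenQuery` and `buildReadTokenQuery` are hypothetical, the `user_id` and `id` columns are guesses about the institutions table, and the `{ text, values }` shape is the query-config form accepted by clients such as node-postgres rather than whatever client the app actually uses.

```typescript
// Hypothetical helpers: each returns a parameterized statement in the
// { text, values } form accepted by clients such as node-postgres.
type Query = { text: string; values: string[] };

function buildStoreTokenQuery(
  userId: string,
  accessToken: string,
  encKey: string,
): Query {
  return {
    // pgp_sym_encrypt(data, passphrase) is the pgcrypto function named above.
    // The user_id column is an assumption about the institutions table.
    text:
      "insert into institutions (user_id, plaid_access_token_enc) " +
      "values ($1, pgp_sym_encrypt($2, $3))",
    // Token and key travel as bind parameters, never inside the SQL text.
    values: [userId, accessToken, encKey],
  };
}

function buildReadTokenQuery(institutionId: string, encKey: string): Query {
  return {
    // pgp_sym_decrypt(bytea, passphrase) returns the plaintext as text.
    // The id column is an assumption about the institutions table.
    text:
      "select pgp_sym_decrypt(plaid_access_token_enc, $2) as access_token " +
      "from institutions where id = $1",
    values: [institutionId, encKey],
  };
}
```

Binding the token and key as parameters keeps them out of the statement text, so a logged or echoed SQL string does not contain them. Whether bind parameters themselves are logged depends on the database's logging configuration, which is worth checking separately.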
5. Encryption commitments
| Layer | What | How |
|---|---|---|
| In transit | All client ↔ server traffic | TLS 1.3 (Vercel + Supabase enforced) |
| At rest, disk-level | All Postgres data, all Supabase Storage (we use none) | AWS-managed, AES-256, Supabase platform default |
| At rest, column-level | Plaid access_token + future BYOK openai_api_key | pgcrypto symmetric (pgp_sym_encrypt/decrypt) with env-held key |
| Backups | Daily logical backup (Free tier = 7d) | Encrypted at rest by Supabase platform |
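Not in scope: end-to-end encryption. It would require client-side key management, and the AI features would have to be deferred, because the server could no longer read the financial data it builds chat context from.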
6. User rights
| Right | Mechanism |
|---|---|
| Export financial data | /settings → Download CSV (planned, M14 or earlier) |
| Export chat | In-chat Export button — Markdown by default, JSON via secondary link. Client-side only, no network round-trip |
| Delete account | /settings → Delete account → cascades through users FK to wipe all tenant data atomically |
| Access disclosure (GDPR Art. 15) | Same export mechanism above; tenant has full read access via RLS |
| Rectification | Inline edit on any record (transactions, budgets, etc.) |
| Restrict processing | Disconnect Plaid institutions; revoke OpenAI API key (when BYOK ships in M16) |
All rights above apply to every Loocero customer regardless of tier or pricing plan.
7. Change control
This document and its companions (DATA_RETENTION.md, DATA_MAP.md) are part of the privacy contract. Changes require:
- PR with rationale in the description, linking the user-visible feature that motivates the change.
- Update to production-hardening.md §8 if the enforcement surface changes.
- Memory entry update in project_loocero_privacy.md.
- If a rule is materially weakened: explicit changelog entry in CHANGELOG.md (file added in M14 or earlier).
Adding stricter rules is a regular PR. Loosening any of the seven rules is a project-instructions-level decision and lives outside this doc.