Module Spec: FintraOS Vault
1. Overview
FintraOS Vault is designed to be the "Black Box" of the platform. It will be responsible for the long-term, immutable storage of all financial history and the operational observability of the system.
While Core manages the current state, Vault will manage the history. It answers the question: "What did this user's profile look like on January 1st, 2020?"
Core Responsibilities:
1. Immutable Event Store: The source of truth for every change in the system (see ADR-001).
2. Time Travel Debugging: Reconstructing past states for audit or support.
3. Data Lineage: Tracking where a data point came from (e.g., "Was this category set by Plaid, the ML Model, or the User?").
4. Feed Health: Monitoring the latency and quality of connections.
2. Core Entities
2.1. EventRecord
An EventRecord will be the atomic unit of storage.
{
"event_id": "evt_99999",
"stream_id": "profile_usr_123", // The Aggregate ID
"tenant_id": "org_neobank",
"event_type": "TransactionCategorised",
"version": 15, // Optimistic concurrency control
"timestamp": "2023-10-27T14:30:00Z",
"causation_id": "cmd_888", // The command that triggered this
"correlation_id": "req_777", // The original API request ID
"actor": {
"type": "SYSTEM",
"id": "service_brain" // or "user_123"
},
"payload": {
// Encrypted by Guard Sidecar before writing
"transaction_id": "tx_abc",
"old_category": "Uncategorised",
"new_category": "Groceries",
"confidence": 0.95
}
}
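The role of the version field in optimistic concurrency control can be illustrated with a small in-memory sketch. It is not Vault's implementation: InMemoryEventStore, append, expected_version and ConcurrencyError are hypothetical names, and the real Hot Store is PostgreSQL. The sketch only shows the rule that a write is rejected when the writer's view of the stream is stale.

```python
from dataclasses import dataclass, field
from typing import Any


class ConcurrencyError(Exception):
    """Raised when the caller's expected version no longer matches the stream."""


@dataclass(frozen=True)
class EventRecord:
    event_id: str
    stream_id: str
    event_type: str
    version: int
    payload: dict[str, Any]


@dataclass
class InMemoryEventStore:
    # Hypothetical stand-in for the Hot Store: stream_id -> ordered events.
    streams: dict[str, list[EventRecord]] = field(default_factory=dict)

    def append(
        self,
        stream_id: str,
        event_id: str,
        event_type: str,
        payload: dict[str, Any],
        expected_version: int,
    ) -> EventRecord:
        events = self.streams.setdefault(stream_id, [])
        current_version = events[-1].version if events else 0
        # Reject the write if another writer has appended since the caller read.
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, "
                f"found {current_version}"
            )
        record = EventRecord(
            event_id, stream_id, event_type, current_version + 1, payload
        )
        events.append(record)  # Events are only ever appended, never updated.
        return record
```

A caller that loses the race would typically reload the stream, re-evaluate its command, and retry with the new version.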
2.2. Snapshot
An optimisation that avoids replaying a stream's full history (potentially a million events or more) every time its state is rebuilt.
{
"snapshot_id": "snap_500",
"stream_id": "profile_usr_123",
"version": 1000,
"state": { ...full_profile_json... }
}
2.3. FeedHealthMetric
A FeedHealthMetric will track the reliability of external aggregators.
{
"metric_id": "met_123",
"provider": "PLAID_US",
"institution_id": "ins_chase",
"latency_ms": 450,
"success_rate": 0.99,
"error_count": 2,
"window_start": "2023-10-27T14:00:00Z"
}
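As a rough illustration of how one such record could be produced, the sketch below rolls per-request samples for a window into the fields shown above. RequestSample and summarise_window are hypothetical names, and treating latency_ms as the mean latency of the window is an assumption; the spec does not say whether it is a mean or a percentile.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Any, Sequence


@dataclass(frozen=True)
class RequestSample:
    """One call to an external aggregator (hypothetical input shape)."""
    latency_ms: float
    succeeded: bool


def summarise_window(
    provider: str,
    institution_id: str,
    window_start: str,
    samples: Sequence[RequestSample],
) -> dict[str, Any]:
    if not samples:
        raise ValueError("cannot summarise an empty window")
    error_count = sum(1 for s in samples if not s.succeeded)
    return {
        "provider": provider,
        "institution_id": institution_id,
        # Assumption: latency_ms is the mean latency over the window.
        "latency_ms": round(mean(s.latency_ms for s in samples)),
        "success_rate": (len(samples) - error_count) / len(samples),
        "error_count": error_count,
        "window_start": window_start,
    }
```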
3. Architecture
3.1. Storage Tiers
Vault will utilise a tiered storage architecture to balance cost and speed.
* Hot Store (PostgreSQL): Recent events (last 90 days). Optimised for transactional writes and immediate reads.
* Warm Store (TimescaleDB): Time-series data (balances, stock prices) for the last 2 years.
* Cold Store (S3 / Parquet): Archived events older than 2 years. Cheap, immutable, used for training ML models.
3.2. The "Replay" Engine
The Replay Engine will allow developers to "Time Travel."
* Input: profile_id, target_date.
* Process:
1. Find the nearest Snapshot before target_date.
2. Load Snapshot.
3. Replay, in version order, all EventRecords newer than the Snapshot's version, stopping at target_date.
* Output: The exact JSON state of the profile at that moment.
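The process above can be sketched as a fold over events that starts from a snapshot. This is a minimal illustration under stated assumptions, not the engine's actual code: the Snapshot entity in 2.2 carries no timestamp, so the taken_at field is assumed here in order to find the "nearest Snapshot before target_date"; replay and apply_event are hypothetical names; and "before" is read as "at or before".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Iterable

State = dict[str, Any]


@dataclass(frozen=True)
class Event:
    version: int
    timestamp: datetime
    event_type: str
    payload: dict[str, Any]


@dataclass(frozen=True)
class Snapshot:
    version: int
    taken_at: datetime  # Assumption: not part of the Snapshot entity in 2.2.
    state: State


def apply_event(state: State, event: Event) -> State:
    """Example reducer: returns a new state and never mutates its input."""
    if event.event_type == "TransactionCategorised":
        categories = dict(state.get("categories", {}))
        categories[event.payload["transaction_id"]] = event.payload["new_category"]
        return {**state, "categories": categories}
    return state  # Unknown event types leave the state unchanged.


def replay(
    snapshots: Iterable[Snapshot],
    events: Iterable[Event],
    target_date: datetime,
    apply: Callable[[State, Event], State],
) -> State:
    # 1. Find the latest snapshot taken at or before target_date.
    eligible = [s for s in snapshots if s.taken_at <= target_date]
    base = max(eligible, key=lambda s: s.version, default=None)
    # 2. Load it, or start from an empty state if none exists.
    state: State = dict(base.state) if base else {}
    from_version = base.version if base else 0
    # 3. Replay newer events in version order, stopping at target_date.
    for event in sorted(events, key=lambda e: e.version):
        if event.version > from_version and event.timestamp <= target_date:
            state = apply(state, event)
    return state
```

Keeping the reducer pure (it returns a new state instead of mutating the old one) means a loaded snapshot can be reused safely across replays.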
3.3. Lineage Tracker
Every field in the Read Model will include a "Source" tag derived from Vault.
* category: "Groceries"
* _meta.category.source: "MODEL_V2"
* _meta.category.updated_at: "2023-10-01"
This is critical for debugging "Why did the AI tag this wrong?"
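One way the tag could be derived is to copy the actor and timestamp of the event that last set a field into the read model's metadata. The sketch below assumes the source is taken from actor.id and that the dotted paths above map to a nested _meta object; neither is confirmed by the spec, and apply_category_event is a hypothetical name.

```python
from typing import Any


def apply_category_event(
    read_model: dict[str, Any], event: dict[str, Any]
) -> dict[str, Any]:
    """Project a TransactionCategorised event and record its lineage."""
    meta = dict(read_model.get("_meta", {}))
    meta["category"] = {
        # Assumption: the lineage source is the id of the event's actor.
        "source": event["actor"]["id"],
        "updated_at": event["timestamp"],
    }
    return {
        **read_model,
        "category": event["payload"]["new_category"],
        "_meta": meta,
    }
```

Because the tag is rebuilt from the event itself, replaying the stream reproduces the lineage along with the values.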
3.4. The Simulation Engine (What-If Scenarios)
Will be a powerful B2B sales tool allowing tenants to simulate logic on historical data. (Consumed by Simulation View).
- Endpoint: POST /v1/vault/simulate
- Payload:
- Process: Replays the user's history through a "Virtual Brain" running the new logic.
- Response: { "actual_savings": 0, "simulated_savings": 450.50 }
- Value: "Show, don't just tell" value proposition for prospective customers.
4. Compliance & Audit
4.1. The "Audit Log" View
Will be a developer-facing API to show end-users who did what.
* GET /v1/vault/audit?profile_id=usr_123
* Response:
* "User logged in (IP 1.2.3.4)"
* "User linked Barclays Account"
* "System detected Salary"
* "Support Agent X viewed Profile (Reason: Ticket #555)"
4.2. Crypto-Shredding Verification
When Guard deletes a key, Vault will:
1. Run a verification job.
2. Attempt to decrypt a random sample of events for that user.
3. If decryption fails for every sampled event (as expected), log a Compliance.ErasureVerified event.
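The verification job can be sketched as follows. The decryption call is passed in as a function because the Guard interface is not described here; verify_erasure, try_decrypt, DecryptionError and the Compliance.ErasureFailed event name are all hypothetical. Only Compliance.ErasureVerified comes from this spec.

```python
import random
from typing import Any, Callable, Optional, Sequence


class DecryptionError(Exception):
    """Hypothetical error raised when a payload cannot be decrypted."""


def verify_erasure(
    encrypted_payloads: Sequence[bytes],
    try_decrypt: Callable[[bytes], bytes],
    sample_size: int = 10,
    rng: Optional[random.Random] = None,
) -> dict[str, Any]:
    rng = rng or random.Random()
    # Draw a random sample without replacement, capped at the population size.
    population = list(encrypted_payloads)
    sample = rng.sample(population, min(sample_size, len(population)))
    still_readable = 0
    for blob in sample:
        try:
            try_decrypt(blob)
        except DecryptionError:
            continue  # Expected outcome once the key has been deleted.
        still_readable += 1
    verified = still_readable == 0
    return {
        # "Compliance.ErasureFailed" is an assumed name for the failure case.
        "event_type": "Compliance.ErasureVerified" if verified else "Compliance.ErasureFailed",
        "sampled": len(sample),
        "still_readable": still_readable,
    }
```

Note that a user with no stored events yields an empty sample and is reported as verified; a real job may want to treat that case separately.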
5. Diagnostics API (Internal)
5.1. GET /v1/vault/health/providers
Will return the real-time status of bank connections.
* "Chase US: Degraded (High Latency)"
* "Revolut UK: Operational"
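A status string like these could be derived from a FeedHealthMetric (section 2.3) by comparing it against thresholds. The sketch below is illustrative only: provider_status is a hypothetical name, the threshold values are placeholders, and the "Low Success Rate" reason does not appear in this spec.

```python
from typing import Any


def provider_status(
    metric: dict[str, Any],
    max_latency_ms: float = 1000,     # Placeholder threshold, not from the spec.
    min_success_rate: float = 0.95,   # Placeholder threshold, not from the spec.
) -> str:
    reasons = []
    if metric["latency_ms"] > max_latency_ms:
        reasons.append("High Latency")
    if metric["success_rate"] < min_success_rate:
        reasons.append("Low Success Rate")  # Assumed label for this case.
    if reasons:
        return f"Degraded ({', '.join(reasons)})"
    return "Operational"
```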
5.2. POST /v1/vault/replay
Will trigger a replay of events to fix corrupted state.
* Use Case: We deployed a bug in the "Net Worth" calculator. We fix the bug, then ask Vault to "Replay all events for Tenant X" to repair their dashboards.
5.3. POST /v1/vault/simulate
Will allow tenants to run "What-If" scenarios on historical data to prove value to their users.
* Use Case: "If you had used our 'Round-Up' feature last year, how much would you have saved?"
* Process:
1. Clone the user's event stream to a temporary sandbox.
2. Inject a new LogicRule (e.g., "Round up every transaction").
3. Replay the stream.
4. Compare the End State (Sandbox) vs Actual State (Live).
* Output: { "actual_savings": 0, "simulated_savings": 450.50 }
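The four steps can be sketched with a "Round-Up" rule as the injected logic. Everything below is a simplified illustration: the event shape (a dict with an amount string), the Rule type, and the function names are assumptions, and a real LogicRule would run inside the replay machinery, not over a plain list.

```python
import copy
from decimal import Decimal, ROUND_CEILING
from typing import Callable, Sequence

# A rule maps one event to the amount it would have moved into savings.
Rule = Callable[[dict], Decimal]


def round_up(event: dict) -> Decimal:
    """Spare change needed to round a transaction up to the next whole unit."""
    amount = Decimal(event["amount"])
    return amount.to_integral_value(rounding=ROUND_CEILING) - amount


def replay_savings(events: Sequence[dict], rules: Sequence[Rule]) -> Decimal:
    savings = Decimal("0")
    for event in events:
        for rule in rules:
            savings += rule(event)
    return savings


def simulate(
    live_events: Sequence[dict],
    new_rule: Rule,
    live_rules: Sequence[Rule] = (),
) -> dict[str, Decimal]:
    # 1. Clone the stream so the sandbox can never touch live data.
    sandbox = copy.deepcopy(list(live_events))
    # 2-3. Inject the new rule and replay the cloned stream.
    simulated = replay_savings(sandbox, [*live_rules, new_rule])
    # 4. Compare against the outcome under the rules that were actually live.
    actual = replay_savings(live_events, live_rules)
    return {"actual_savings": actual, "simulated_savings": simulated}
```

Decimal is used in place of float so that monetary sums such as 0.50 + 0.75 stay exact.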