
Module Spec: FintraOS Vault

1. Overview

FintraOS Vault is designed to be the "Black Box" of the platform. It will be responsible for the long-term, immutable storage of all financial history and the operational observability of the system.

While Core manages the current state, Vault will manage the history. It answers the question: "What did this user's profile look like on January 1st, 2020?"

Core Responsibilities:

  1. Immutable Event Store: The source of truth for every change in the system (see ADR-001).
  2. Time Travel Debugging: Reconstructing past states for audit or support.
  3. Data Lineage: Tracking where a data point came from (e.g., "Was this category set by Plaid, the ML Model, or the User?").
  4. Feed Health: Monitoring the latency and quality of connections.


2. Core Entities

2.1. EventRecord

EventRecord will be the atomic unit of storage.

{
  "event_id": "evt_99999",
  "stream_id": "profile_usr_123", // The Aggregate ID
  "tenant_id": "org_neobank",
  "event_type": "TransactionCategorised",
  "version": 15, // Optimistic concurrency control
  "timestamp": "2023-10-27T14:30:00Z",
  "causation_id": "cmd_888", // The command that triggered this
  "correlation_id": "req_777", // The original API request ID
  "actor": {
    "type": "SYSTEM",
    "id": "service_brain" // or "user_123"
  },
  "payload": {
    // Encrypted by Guard Sidecar before writing
    "transaction_id": "tx_abc",
    "old_category": "Uncategorised",
    "new_category": "Groceries",
    "confidence": 0.95
  }
}
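The version field above is described as supporting optimistic concurrency control. A minimal in-memory sketch of how an append could enforce that check follows. The names InMemoryEventStore, ConcurrencyError, and the expected_version parameter are illustrative assumptions, not part of this spec, and a real store would enforce the check in the database.

```python
from dataclasses import dataclass, field
from typing import Any


class ConcurrencyError(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


@dataclass(frozen=True)
class EventRecord:
    # A reduced subset of the EventRecord fields shown in the spec.
    event_id: str
    stream_id: str
    event_type: str
    version: int
    payload: dict[str, Any]


@dataclass
class InMemoryEventStore:
    # stream_id -> ordered, append-only list of events (illustration only)
    streams: dict[str, list[EventRecord]] = field(default_factory=dict)

    def append(self, stream_id: str, expected_version: int, event_id: str,
               event_type: str, payload: dict[str, Any]) -> EventRecord:
        events = self.streams.setdefault(stream_id, [])
        current_version = events[-1].version if events else 0
        # Reject the write if another writer has advanced the stream.
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected v{expected_version}, found v{current_version}"
            )
        record = EventRecord(event_id, stream_id, event_type,
                             current_version + 1, payload)
        events.append(record)
        return record
```

A writer that read the stream at version 15 would pass expected_version=15; if a concurrent writer got there first, the append fails and the caller can reload and retry.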

2.2. Snapshot

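A Snapshot will be an optimisation that stores a stream's full state at a given version, so that reconstructing state does not require replaying every event (e.g., 1 million events) from the start.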

{
  "snapshot_id": "snap_500",
  "stream_id": "profile_usr_123",
  "version": 1000,
  "state": { ...full_profile_json... }
}

2.3. FeedHealthMetric

FeedHealthMetric will track the reliability of external aggregators.

{
  "metric_id": "met_123",
  "provider": "PLAID_US",
  "institution_id": "ins_chase",
  "latency_ms": 450,
  "success_rate": 0.99,
  "error_count": 2,
  "window_start": "2023-10-27T14:00:00Z"
}
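As a rough illustration of how one window of such a metric might be derived from raw request outcomes, consider the sketch below. It assumes latency_ms is the mean latency over the window and success_rate is the fraction of successful requests; the spec does not define either aggregation, and RequestSample and summarise_window are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestSample:
    # One call to an external aggregator (hypothetical shape).
    latency_ms: int
    ok: bool


def summarise_window(samples: list[RequestSample]) -> dict:
    """Aggregate one window of samples into FeedHealthMetric-style fields."""
    if not samples:
        # No traffic in the window: nothing meaningful to report.
        return {"latency_ms": None, "success_rate": None, "error_count": 0}
    error_count = sum(1 for s in samples if not s.ok)
    mean_latency = sum(s.latency_ms for s in samples) / len(samples)
    return {
        "latency_ms": round(mean_latency),  # assumption: mean, not a percentile
        "success_rate": round(1 - error_count / len(samples), 4),
        "error_count": error_count,
    }
```

A percentile (for example p95) may be a better latency summary in practice; the mean is used here only to keep the sketch short.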


3. Architecture

3.1. Storage Tiers

Vault will utilise a tiered storage architecture to balance cost and speed.

  • Hot Store (PostgreSQL): Recent events (last 90 days). Optimised for transactional writes and immediate reads.
  • Warm Store (TimescaleDB): Time-series data (balances, stock prices) for the last 2 years.
  • Cold Store (S3 / Parquet): Archived events older than 2 years. Cheap, immutable, used for training ML models.

3.2. The "Replay" Engine

The Replay Engine will allow developers to "Time Travel."

  • Input: profile_id, target_date.
  • Process:
    1. Find the nearest Snapshot before target_date.
    2. Load the Snapshot.
    3. Replay all EventRecords from the Snapshot's version up to target_date.
  • Output: The exact JSON state of the profile at that moment.
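The snapshot-then-replay process can be sketched as follows. This is a minimal in-memory illustration, not the engine's implementation: the taken_at field on Snapshot, the apply_event reducer, and the state_at function are assumptions introduced for the example (the Snapshot entity in this spec carries only a version), and only the TransactionCategorised event is handled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Event:
    version: int
    timestamp: datetime
    event_type: str
    payload: dict[str, Any]


@dataclass(frozen=True)
class Snapshot:
    version: int
    taken_at: datetime  # assumption: snapshots record when they were taken
    state: dict[str, Any]


def apply_event(state: dict[str, Any], event: Event) -> dict[str, Any]:
    """Pure reducer: returns a new state and never mutates the input."""
    new_state = dict(state)
    if event.event_type == "TransactionCategorised":
        categories = dict(new_state.get("categories", {}))
        categories[event.payload["transaction_id"]] = event.payload["new_category"]
        new_state["categories"] = categories
    return new_state


def state_at(events: list[Event], snapshots: list[Snapshot],
             target_date: datetime) -> dict[str, Any]:
    # 1. Find the nearest snapshot at or before target_date.
    candidates = [s for s in snapshots if s.taken_at <= target_date]
    base = max(candidates, key=lambda s: s.version, default=None)
    # 2. Load it (or start from an empty state if none exists).
    state = dict(base.state) if base else {}
    from_version = base.version if base else 0
    # 3. Replay only the events after the snapshot, up to target_date.
    for event in sorted(events, key=lambda e: e.version):
        if event.version > from_version and event.timestamp <= target_date:
            state = apply_event(state, event)
    return state
```

Because the reducer is pure, replaying from a snapshot and replaying from the first event should produce the same state, which is a useful property to assert in tests.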

3.3. Lineage Tracker

Every field in the Read Model will include a "Source" tag derived from Vault.

  • category: "Groceries"
  • _meta.category.source: "MODEL_V2"
  • _meta.category.updated_at: "2023-10-01"

This is critical for answering debugging questions such as "Why did the AI tag this wrong?"
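One way the _meta tags could be derived is by folding over the events that touched each field and keeping the source of the latest write. The sketch below assumes a simplified event shape with hypothetical "source" and "changes" keys; the spec does not say how the source tag maps onto the EventRecord's actor.

```python
def with_lineage(events: list[dict]) -> dict:
    """Build a read model whose _meta records who last set each field."""
    read_model: dict = {"_meta": {}}
    for event in sorted(events, key=lambda e: e["version"]):
        for field_name, value in event["changes"].items():
            read_model[field_name] = value
            read_model["_meta"][field_name] = {
                "source": event["source"],
                # Keep only the date part of the ISO 8601 timestamp.
                "updated_at": event["timestamp"][:10],
            }
    return read_model
```

For example, if Plaid first set the category and MODEL_V2 later overwrote it, the read model would report MODEL_V2 as the source of the current value.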


3.4. The Simulation Engine (What-If Scenarios)

The Simulation Engine will be a B2B sales tool that allows tenants to simulate logic on historical data. It is consumed by the Simulation View.

  • Endpoint: POST /v1/vault/simulate
  • Payload:
    {
      "logic": "IF transaction.category == 'Coffee' AND amount > 3.00 THEN savings += (amount - 3.00)",
      "time_range": { "start": "2023-01-01", "end": "2023-12-31" }
    }
    
  • Process: Replays the user's history through a "Virtual Brain" running the new logic.
  • Response: { "actual_savings": 0, "simulated_savings": 450.50 }
  • Value: A "Show, don't just tell" proposition for prospective customers.
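To make the example payload concrete, the sketch below evaluates the coffee rule over a list of historical transactions within the requested time range. The spec does not define how the "logic" string is parsed, so the rule is written directly as a Python function; coffee_rule, simulate, and the transaction shape are all assumptions for illustration.

```python
from datetime import date
from decimal import Decimal
from typing import Callable

THRESHOLD = Decimal("3.00")


def coffee_rule(tx: dict) -> Decimal:
    """IF category == 'Coffee' AND amount > 3.00 THEN save (amount - 3.00)."""
    if tx["category"] == "Coffee" and tx["amount"] > THRESHOLD:
        return tx["amount"] - THRESHOLD
    return Decimal("0")


def simulate(transactions: list[dict], rule: Callable[[dict], Decimal],
             start: date, end: date,
             actual_savings: Decimal = Decimal("0")) -> dict:
    """Apply the rule to every transaction inside [start, end]."""
    simulated = sum(
        (rule(tx) for tx in transactions if start <= tx["date"] <= end),
        Decimal("0"),
    )
    return {"actual_savings": actual_savings, "simulated_savings": simulated}
```

Decimal is used rather than float so that monetary sums do not pick up binary rounding errors.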

4. Compliance & Audit

4.1. The "Audit Log" View

The Audit Log View will be a developer-facing API for showing end-users who did what.

  • Endpoint: GET /v1/vault/audit?profile_id=usr_123
  • Response:
    • "User logged in (IP 1.2.3.4)"
    • "User linked Barclays Account"
    • "System detected Salary"
    • "Support Agent X viewed Profile (Reason: Ticket #555)"

4.2. Crypto-Shredding Verification

When Guard deletes a key, Vault will:

  1. Run a verification job.
  2. Attempt to decrypt a random sample of events for that user.
  3. If decryption fails (as expected), log a Compliance.ErasureVerified event.
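The verification job could look roughly like the sketch below. The decrypt callable stands in for whatever interface Guard exposes (not specified here), DecryptionError is a hypothetical exception, and the Compliance.ErasureFailed event emitted when a payload is still readable is an assumption, since the spec only names the success case.

```python
import random
from typing import Callable, Optional, Sequence


class DecryptionError(Exception):
    """Raised by the (hypothetical) decrypt callable when the key is gone."""


def verify_erasure(user_id: str, encrypted_payloads: Sequence[bytes],
                   decrypt: Callable[[str, bytes], bytes],
                   sample_size: int = 5,
                   rng: Optional[random.Random] = None) -> dict:
    """Sample the user's events and confirm none can still be decrypted."""
    rng = rng or random.Random()
    payloads = list(encrypted_payloads)
    sample = rng.sample(payloads, min(sample_size, len(payloads)))
    for blob in sample:
        try:
            decrypt(user_id, blob)
        except DecryptionError:
            continue  # expected: the key has been shredded
        # Decryption succeeded, so the erasure did not take effect.
        return {"event_type": "Compliance.ErasureFailed", "user_id": user_id}
    return {
        "event_type": "Compliance.ErasureVerified",
        "user_id": user_id,
        "sampled": len(sample),  # 0 means nothing was available to check
    }
```

Recording the sample size alongside the event lets an auditor distinguish a meaningful verification from one that ran against an empty stream.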


5. Diagnostics API (Internal)

5.1. GET /v1/vault/health/providers

This endpoint will return the real-time status of bank connections.

  • "Chase US: Degraded (High Latency)"
  • "Revolut UK: Operational"
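A status string of this kind could be derived from the latest FeedHealthMetric for each provider. The sketch below is one possible classification; the thresholds and the "Degraded (High Error Rate)" label are assumptions, as the spec gives only "Degraded (High Latency)" and "Operational" as examples.

```python
def provider_status(metric: dict,
                    max_latency_ms: int = 1000,
                    min_success_rate: float = 0.95) -> str:
    """Map a FeedHealthMetric-like dict to a human-readable status."""
    # Thresholds are placeholders, not values taken from the spec.
    if metric["success_rate"] < min_success_rate:
        return "Degraded (High Error Rate)"  # hypothetical label
    if metric["latency_ms"] > max_latency_ms:
        return "Degraded (High Latency)"
    return "Operational"
```

With these placeholder thresholds, the example metric in section 2.3 (450 ms latency, 0.99 success rate) would be reported as Operational.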

5.2. POST /v1/vault/replay

This endpoint will trigger a replay of events to fix corrupted state.

  • Use Case: Suppose we deploy a bug in the "Net Worth" calculator. After fixing the bug, we ask Vault to "Replay all events for Tenant X" to repair their dashboards.
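The repair in this use case amounts to rebuilding a derived view from the immutable events using the corrected logic. A minimal sketch follows, assuming a hypothetical BalanceUpdated event type and a net worth defined as the sum of each account's latest balance; neither is defined by this spec.

```python
def rebuild_net_worth(events: list[dict], tenant_id: str) -> dict:
    """Recompute net worth per stream for one tenant from raw events."""
    latest_balances: dict = {}
    for event in sorted(events, key=lambda e: (e["stream_id"], e["version"])):
        if event["tenant_id"] != tenant_id:
            continue  # replay is scoped to a single tenant
        if event["event_type"] == "BalanceUpdated":  # hypothetical event type
            accounts = latest_balances.setdefault(event["stream_id"], {})
            # Later versions overwrite earlier balances for the same account.
            accounts[event["payload"]["account_id"]] = event["payload"]["balance"]
    return {
        stream_id: sum(accounts.values())
        for stream_id, accounts in latest_balances.items()
    }
```

Since the events themselves are never modified, the replay can be run repeatedly and should produce the same result each time.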

5.3. POST /v1/vault/simulate

This endpoint will allow tenants to run "What-If" scenarios on historical data to prove value to their users.

  • Use Case: "If you had used our 'Round-Up' feature last year, how much would you have saved?"
  • Process:
    1. Clone the user's event stream to a temporary sandbox.
    2. Inject a new LogicRule (e.g., "Round up every transaction").
    3. Replay the stream.
    4. Compare the End State (Sandbox) vs Actual State (Live).
  • Output: { "actual_savings": 0, "simulated_savings": 450.50 }
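The four steps can be sketched with the Round-Up example. This is an illustration under stated assumptions: the TransactionRecorded event type, the round_up_rule function, and rounding up to the next whole currency unit are hypothetical, and the sandbox is simply a deep copy of an in-memory list.

```python
import copy
from decimal import Decimal, ROUND_CEILING
from typing import Callable


def round_up_rule(amount: Decimal) -> Decimal:
    """Savings from rounding one transaction up to the next whole unit."""
    rounded = amount.to_integral_value(rounding=ROUND_CEILING)
    return rounded - amount


def run_what_if(live_events: list[dict], rule: Callable[[Decimal], Decimal],
                actual_savings: Decimal = Decimal("0")) -> dict:
    # 1. Clone the stream so the live events are never touched.
    sandbox = copy.deepcopy(live_events)
    simulated = Decimal("0")
    # 2 and 3. Replay the clone in version order with the injected rule.
    for event in sorted(sandbox, key=lambda e: e["version"]):
        if event["event_type"] == "TransactionRecorded":  # hypothetical type
            simulated += rule(event["payload"]["amount"])
    # 4. Report the sandbox end state next to the live figure.
    return {"actual_savings": actual_savings, "simulated_savings": simulated}
```

Passing the rule in as a function keeps the replay loop unchanged when a different LogicRule is tested.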